The NASA STI Program Office ... in Profile


Since  its  founding,  NASA  has  been  dedicated 
to  the  advancement  of  aeronautics  and  space 
science.  The  NASA  Scientific  and  Technical 
Information  (STI)  Program  Office  plays  a  key 
part  in  helping  NASA  maintain  this 
important  role. 

The  NASA  STI  Program  Office  is  operated  by 
Langley  Research  Center,  the  lead  center  for 
NASA’s  scientific  and  technical  information. 
The  NASA  STI  Program  Office  provides 
access  to  the  NASA  STI  Database,  the 
largest  collection  of  aeronautical  and  space 
science  STI  in  the  world.  The  Program  Office 
is  also  NASA’s  institutional  mechanism  for 
disseminating  the  results  of  its  research  and 
development  activities.  These  results  are 
published  by  NASA  in  the  NASA  STI  Report 
Series,  which  includes  the  following  report 
types: 

•  TECHNICAL  PUBLICATION.  Reports  of 
completed  research  or  a  major  significant 
phase  of  research  that  present  the  results 
of  NASA  programs  and  include  extensive 
data  or  theoretical  analysis.  Includes 
compilations  of  significant  scientific  and 
technical data and information deemed
to be of continuing reference value. NASA’s
counterpart of peer-reviewed formal
professional papers, but having less
stringent limitations on manuscript
length and extent of graphic
presentations.

•  TECHNICAL  MEMORANDUM. 
Scientific  and  technical  findings  that  are 
preliminary  or  of  specialized  interest, 
e.g.,  quick  release  reports,  working 
papers,  and  bibliographies  that  contain 
minimal  annotation.  Does  not  contain 
extensive  analysis. 

•  CONTRACTOR  REPORT.  Scientific  and 
technical  findings  by  NASA-sponsored 
contractors  and  grantees. 


•  CONFERENCE  PUBLICATIONS. 
Collected  papers  from  scientific  and 
technical  conferences,  symposia, 
seminars,  or  other  meetings  sponsored  or 
co-sponsored  by  NASA. 

•  SPECIAL  PUBLICATION.  Scientific, 
technical,  or  historical  information  from 
NASA  programs,  projects,  and  missions, 
often  concerned  with  subjects  having 
substantial  public  interest. 

•  TECHNICAL  TRANSLATION.  English- 
language  translations  of  foreign  scientific 
and  technical  material  pertinent  to 
NASA’s  mission. 

Specialized  services  that  help  round  out  the 
STI  Program  Office’s  diverse  offerings  include 
creating  custom  thesauri,  building  customized 
databases,  organizing  and  publishing 
research  results  . . .  even  providing  videos. 

For  more  information  about  the  NASA  STI 
Program  Office,  you  can: 

•  Access  the  NASA  STI  Program  Home 
Page at http://www.sti.nasa.gov/STI-homepage.html

•  Email  your  question  via  the  Internet  to 
help@sti.nasa.gov

•  Fax  your  question  to  the  NASA  Access 
Help  Desk  at  (301)  621-0134 

•  Phone  the  NASA  Access  Help  Desk  at 
(301)  621-0390 

•  Write  to: 

NASA  Access  Help  Desk 
NASA  Center  for  AeroSpace  Information 
7121  Standard  Drive 
Hanover,  MD  21076-1320 
INTRODUCTION


The  Institute  for  Computer  Applications  in  Science  and  Engineering  (ICASE)*  is  operated  at  the  Langley 
Research Center (LaRC) of NASA by the Universities Space Research Association (USRA) under a contract
with the Center. USRA is a nonprofit consortium of major U.S. colleges and universities.

The Institute conducts unclassified basic research in applied mathematics, numerical analysis and algorithm
development, fluid mechanics, and computer science in order to extend and improve problem-solving
capabilities  in  science  and  engineering,  particularly  in  the  areas  of  aeronautics  and  space  research. 

ICASE has a small permanent staff. Research is conducted primarily by visiting scientists from universities
and industry who have resident appointments for limited periods of time as well as by visiting and
resident  consultants.  Members  of  NASA’s  research  staff  may  also  be  residents  at  ICASE  for  limited  periods. 

The  major  categories  of  the  current  ICASE  research  program  are: 

•  Applied  and  numerical  mathematics,  including  multidisciplinary  design  optimization; 

•  Theoretical  and  computational  research  in  fluid  mechanics  in  selected  areas  of  interest  to  LaRC, 
such  as  transition,  turbulence,  flow  control,  and  acoustics;  and 

•  Applied  computer  science:  system  software,  systems  engineering,  and  parallel  algorithms. 

ICASE reports are primarily considered to be preprints of manuscripts that have been submitted to
appropriate research journals or that are to appear in conference proceedings. A list of these reports for the
period  October  1,  1998  through  March  31,  1999  is  given  in  the  Reports  and  Abstracts  section  which  follows 
a  brief  description  of  the  research  in  progress. 


* ICASE is operated at NASA Langley Research Center, Hampton, VA, under the National Aeronautics and Space Administration,
NASA Contract No. NAS1-97046. Financial support was provided by NASA Contract Nos. NAS1-97046, NAS1-19480,
NAS1-18605, NAS1-18107, NAS1-17070, NAS1-17130, NAS1-15810, NAS1-16394, NAS1-14101, and NAS1-14472.
RESEARCH IN PROGRESS


APPLIED  AND  NUMERICAL  MATHEMATICS 


BRIAN  ALLAN 

Closed-loop  Separation  Control  Using  Oscillatory  Flow  Excitation 

Experimental  results  have  shown  that  oscillatory  blowing,  introduced  upstream  of  a  separated  boundary 
layer,  can  effectively  delay  boundary  layer  separation.  This  method  for  separation  control  can  be  used  for 
several  different  flow  control  problems.  Some  of  the  current  research  areas  of  separation  control  are  high  lift 
enhancement  and  maneuver  control.  This  project  will  develop  a  feedback  controller  which  will  control  the 
amount  of  separation  in  the  boundary  layer.  The  feedback  controller  designed  will  then  be  used  in  a  wind 
tunnel  test  on  an  airfoil  model  with  oscillatory  blowing. 

Currently  we  have  a  design  methodology  for  the  feedback  controller  using  a  robust  control  design  method. 
The  controller  is  designed  to  track  a  desired  pressure  gradient  in  the  separated  boundary  layer.  We  will  not 
know  what  the  dynamics  of  the  flow  system  are  until  the  wind  tunnel  tests  are  conducted.  When  the 
wind  tunnel  tests  start,  we  will  be  able  to  get  an  accurate  model  of  the  system.  This  model  will  then  be 
used  in  our  current  controller  design  methodology.  We  are  also  building  a  hardware  interface  for  the  flow 
experiment  which  will  provide  the  feedback  control  to  the  experiment.  This  hardware  interface  is  designed 
to  be  transferable  to  other  flow  control  experiments  at  NASA  Langley. 

This control design method and hardware interface are scheduled to be tested in a future wind tunnel
test. The hardware interface and control design experience gained from this project will be transferred to
other flow control experiments at NASA Langley.

This  research  was  conducted  in  collaboration  with  Jer-Nan  Juang  (NASA  Langley),  David  Raney  (NASA 
Langley),  and  Avi  Seifert  (NRC). 

EYAL  ARIAN 

Approximations  of  the  Newton  Step  for  Large-scale  Optimization  Problems 

Quasi-Newton  methods  for  large-scale  optimization  problems  are  powerful  but  suffer  an  initial  slow 
convergence  rate.  Our  goal  is  to  develop  a  new  iterative  method,  for  the  solution  of  large-scale  optimization 
problems,  that  will  allow  a  better  approximation  for  the  Newton  step  right  from  the  first  optimization  steps. 

In  the  course  of  the  optimization  process,  systems  of  linear  equations  are  constructed  that  contain  the 
linearized  state  operator  and  its  adjoint.  These  have  to  be  solved  at  each  iteration  to  achieve  convergence 
of  the  iterates  to  the  Newton  step.  We  are  investigating  a  defect-correction  method  to  solve  these  systems 
of  equations  for  highly  ill-conditioned  problems  with  many  design  variables.  Preliminary  numerical  tests  on 
the  potential  small  disturbance  shape  optimization  problem  are  promising. 

Our  plan  is  to  further  investigate  the  above  method  for  applications  that  are  governed  by  nonlinear 
equations.  This  approach  can  be  naturally  embedded  in  a  SQP  formulation  of  the  problem. 

This research was conducted in collaboration with A. Battermann and E. Sachs (Universität Trier,
Germany). 
Large-scale Aerodynamic Shape Optimization

The purpose of this work is to develop and apply algorithms which do not require more than a few full
solutions of the flow equations to obtain the optimum.

Our approach is to apply approximations at the PDE level to the numerical solution of a practical
large-scale optimization problem. We are working on shape optimization of a 3D geometry using TLNS3D.

This  research  was  conducted  in  collaboration  with  V.  Vatsa  (NASA  Langley). 

H.T.  BANKS 

Electromagnetic  Interrogation  of  Structures 

The detection and characterization of subsurface damage (cracks, internal corrosion, etc.) is an important
problem in aging structures such as airfoils. In collaboration with scientists in the Nondestructive
Evaluation  Branch  at  NASA,  we  are  developing  computational  techniques  for  inverse  problems  involving 
electromagnetic  interrogation  of  structures  using  superconducting  quantum  interference  devices  (SQUIDs). 

Our  approach  is  to  develop  reduced-order  model  computational  methods  for  Maxwell’s  equations  in  a 
dielectric  medium  to  be  used  in  inverse  algorithms.  To  date,  we  have  developed  models  based  on  eddy  current 
interrogation  of  structures.  These  models  are  being  tested  using  a  full  Maxwell  solver  in  ANSOFT  which 
computes time-dependent fields in terms of a vector magnetic potential A in phasor form. Our reduced-order
methods  are  based  on  Karhunen-Loeve  or  Proper  Orthogonal  Decomposition  (POD)  methods. 

We  have  made  significant  progress  on  the  modeling  and  computational  aspects  of  this  problem  and  are 
currently testing our ideas with simulated SQUID data for model verification and assessment of the ability
to  identify  and  characterize  damage  geometries  in  a  structure. 

BORIS  DISKIN 

Efficient  Methods  for  Solving  Upwind-biased  Discretizations  of  Advection  Equation 

The efficiency of methods for solving the advection equation is extremely important in devising solvers
for complicated computational fluid dynamics problems. Frequently, the overall convergence rate of a sophisticated
solver is determined by the convergence of a built-in algorithm solving the advection equation. The
simplest way to invert the advection operator is to employ downstream marching. If the corresponding discretization
is a stable upwind discretization and the velocity field does not recirculate, then this marching
proves to be a very efficient solver, yielding an accurate solution to a discretized nonlinear advection equation
in just a few sweeps (a single downstream sweep provides the exact solution to a linearized problem).
However, if a discretization of the advection operator is not fully upwind (e.g., only upwind-biased), then
marching in its pure form is inapplicable and other solution methods should be employed. In this period, we
systematically  studied  two  methods  for  solving  upwind-biased  discretizations  of  the  advection  operator:  the 
defect-correction  method  and  the  multigrid  method  using  semicoarsening.  This  research  was  motivated  by 
the  search  for  an  explanation  of  convergence  properties  of  an  existing  full  Euler  system  solver  and  also  by 
the  wish  to  extend  the  range  of  available  advection  solvers  taking  into  account  parallelization  perspectives. 
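
The remark that a single downstream sweep solves the linearized problem exactly can be checked on a small model. The sketch below is an illustration only, not the solver studied in this project: it assumes the constant-coefficient equation a*u_x + b*u_y = f with a, b > 0 on a uniform Cartesian grid, first-order upwind differences, and inflow data stored in row 0 and column 0 of the array. The names `march_upwind` and `max_residual` are hypothetical.

```python
def march_upwind(a, b, h, f, u):
    """One downstream sweep for a*u_x + b*u_y = f with a, b > 0.

    u is an (n+1) x (n+1) list of lists whose row 0 and column 0 already
    hold the inflow boundary values. First-order upwind differences are
    used, so each unknown depends only on already-visited upstream points.
    """
    n = len(u) - 1
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            # a*(u_ij - u_{i-1,j})/h + b*(u_ij - u_{i,j-1})/h = f_ij
            u[i][j] = (h * f[i][j] + a * u[i - 1][j] + b * u[i][j - 1]) / (a + b)
    return u


def max_residual(a, b, h, f, u):
    """Largest absolute residual of the discrete upwind equations."""
    n = len(u) - 1
    return max(
        abs(a * (u[i][j] - u[i - 1][j]) / h
            + b * (u[i][j] - u[i][j - 1]) / h
            - f[i][j])
        for i in range(1, n + 1)
        for j in range(1, n + 1)
    )


# Demo: inflow data and source chosen so that the exact solution is u = x + y.
n = 8
h = 1.0 / n
a, b = 2.0, 1.0
f = [[a + b] * (n + 1) for _ in range(n + 1)]
u = [[0.0] * (n + 1) for _ in range(n + 1)]
for k in range(n + 1):
    u[0][k] = k * h   # inflow boundary x = 0
    u[k][0] = k * h   # inflow boundary y = 0
march_upwind(a, b, h, f, u)
print("residual after one sweep:", max_residual(a, b, h, f, u))
```

Because every stencil point other than the one being updated lies upstream, the sweep order makes the discrete system triangular. A stencil that also references a downstream neighbor (an upwind-biased one) loses this property, which is why other solvers are needed in that case.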

The  defect-correction  and  multigrid  methods  have  been  analyzed  in  application  to  discretized  advection 
equations  corresponding  to  flow  at  some  angle  of  attack  to  a  uniform  Cartesian  grid.  We  have  developed  a 
novel  comprehensive  mode  analysis.  This  analysis  predicts  the  convergence  rate  for  each  iteration  and  the 
asymptotic convergence rate. On the basis of this analysis, we have explained many surprising details observed
in numerical calculations (e.g., establishment of a good asymptotic convergence rate after many poorly
converging  defect-correction  iterations  and  fast  convergence  in  a  multigrid  cycle  employing  semicoarsening 
far  surpassing  the  theoretical  limit  predicted  for  standard  multigrid  algorithms  using  full  coarsening).  It 
has  been  found,  analytically  and  experimentally,  that  the  convergence  properties  of  the  defect-correction 
iterations  are  grid-dependent.  The  number  of  iterations  required  to  converge  algebraic  error  below  the 
truncation  error  level  might  grow  on  fine  grids  as  a  negative  power  of  the  mesh  size.  On  the  contrary,  the 
efficiency of the multigrid algorithm does not deteriorate with increasing cycle depth (number of levels)
and/or refinement of the target-grid mesh. This multigrid method uses colored relaxation schemes on all the grids
and, therefore, is very attractive for parallel computing. A new, very efficient adaptive multilevel approach
to deriving a discrete solution approximating the true continuous solution within a given relative accuracy
has been developed. This approach was tested for both the defect-correction and multigrid methods.
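
For readers unfamiliar with defect correction, the following is a minimal one-dimensional sketch of the idea, under assumptions that are not taken from this project: a periodic model problem a*u_x + c*u = f with a, c > 0 (the reaction term c*u is added only to make the periodic problem uniquely solvable), Fromm's second-order upwind-biased stencil as the target discretization, and the first-order upwind operator, inverted by marching, as the driver. All function names are hypothetical. Each iteration computes the defect of the target operator and corrects the iterate by solving with the cheap driver operator.

```python
import math


def solve_driver(a, c, h, rhs):
    """Solve the periodic first-order upwind system
    a*(u[i] - u[i-1])/h + c*u[i] = rhs[i] by downstream marching."""
    n = len(rhs)
    r = a / (a + c * h)                      # weight of the upstream value, 0 < r < 1
    g = [h * v / (a + c * h) for v in rhs]
    # Particular solution marched from a zero upstream value.
    p = [0.0] * n
    prev = 0.0
    for i in range(n):
        prev = r * prev + g[i]
        p[i] = prev
    # Close the periodic wrap-around: u[n-1] = p[n-1] + r**n * u[n-1].
    last = p[-1] / (1.0 - r ** n)
    return [p[i] + r ** (i + 1) * last for i in range(n)]


def apply_target(a, c, h, u):
    """Apply the periodic upwind-biased (Fromm) operator plus the reaction term."""
    n = len(u)
    return [
        a * (u[(i + 1) % n] + 3.0 * u[i] - 5.0 * u[i - 1] + u[i - 2]) / (4.0 * h)
        + c * u[i]
        for i in range(n)
    ]


def defect_correction(a, c, n, f, iterations):
    """Iterate u <- u + L1^{-1} (f - L2 u); return u and the defect norms."""
    h = 1.0 / n
    u = [0.0] * n
    history = []
    for _ in range(iterations):
        defect = [fi - ti for fi, ti in zip(f, apply_target(a, c, h, u))]
        history.append(math.sqrt(sum(d * d for d in defect)))
        correction = solve_driver(a, c, h, defect)
        u = [ui + di for ui, di in zip(u, correction)]
    return u, history


# Demo on a smooth periodic source term.
n = 32
f = [math.sin(2 * math.pi * i / n) + 0.5 * math.cos(6 * math.pi * i / n)
     for i in range(n)]
u, history = defect_correction(1.0, 1.0, n, f, 40)
print("defect norms, first five iterations:", history[:5])
```

In this periodic constant-coefficient setting a Fourier analysis bounds the per-iteration reduction factor by 1/2, so the sketch converges quickly. It does not reproduce the grid-dependent behavior reported above for two-dimensional flow at an angle to the grid.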

As an additional option, we are going to analyze the predictor-corrector method for solving upwind-biased
discretizations. We also plan to implement some of the proposed ideas in the framework of the
existing  3D  full  Euler  system  solver. 

This  research  was  conducted  in  collaboration  with  J.L.  Thomas  (NASA  Langley). 

JAN  S.  HESTHAVEN 

Well-posed  Perfectly  Matched  Layers  for  Advective  Acoustics 

The ability to accurately simulate wave phenomena is important in several physical fields, e.g., electromagnetics,
ambient acoustics, advective acoustics associated with a mean flow, elasticity, and seismology.

Often the numerical simulations of such problems, due to limited computing resources, must be confined
to truncated domains much smaller than the physical space over which the wave phenomena take place. In
such cases, numerical reflections of outgoing waves from the boundaries of the numerical domain can falsify
the computational results. This artifact limits the overall order of accuracy of the algorithm used in the
computation. This is particularly troublesome in cases where higher-order accuracy is required by mode
resolution, storage availability, etc.

Utilizing a mathematical framework created for the development of perfectly matched layer (PML)
schemes within computational electromagnetics, we have developed a set of strongly well-posed PML equations
for the absorption of acoustic and vorticity waves in two-dimensional convective acoustics under the
assumption  of  a  spatially  constant  mean  flow. 

A central piece in this formulation is the development of a variable transformation that conserves the dispersion
relation of the physical space equations. The PML equations are given for layers being perpendicular
to  the  direction  of  the  mean  flow  as  well  as  for  layers  aligned  parallel  to  the  mean  flow. 

The  efficacy  of  the  PML  scheme  has  been  tested  by  solving  the  equations  of  acoustics  using  a  fourth-order 
scheme,  confirming  the  accuracy  as  well  as  stability  of  the  proposed  schemes. 

The  development  of  a  PML  for  the  three-dimensional  equations  of  acoustics  is  straightforward  provided 
only that the mean flow can be considered spatially constant. Of equal importance, however, is the development
of PML methods for problems involving smoothly varying mean flows, as in boundary layers and jets.
While the mathematical tools developed so far certainly are applicable for sufficiently smooth variations,
new developments are most likely needed to address the general variable coefficient problem, and we hope to
address these questions in the near future.

This  research  was  conducted  in  collaboration  with  S.  Abarbanel  (Tel  Aviv  University)  and  D.  Gottlieb 
(Brown  University). 
MICHAEL LEWIS


Pattern  Search  Methods  for  Nonlinear  Optimization 

Pattern  search  methods  for  nonlinear  optimization  have  a  number  of  features  that  make  them  attractive 
for use in engineering optimization. These methods are easy to understand and implement, are scalably
parallel,  and  neither  require  nor  estimate  derivatives. 
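
As an illustration of why such methods are easy to implement, the following is a minimal sketch of compass search, one of the simplest members of the pattern search family. It is not the algorithm developed in this project: it handles only unconstrained problems, and the function name `compass_search`, its parameters, and the test objective are assumptions made for this example. The method polls the objective at plus and minus one step along each coordinate, moves to the first point that improves the objective, and halves the step when no poll point improves.

```python
def compass_search(f, x0, step=1.0, tol=1e-6, max_iter=10000):
    """Derivative-free minimization by coordinate polling.

    f    : objective taking a list of floats
    x0   : starting point
    step : initial pattern size
    tol  : stop once the pattern size falls below this value
    """
    x = list(x0)
    fx = f(x)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(x)):
            for s in (step, -step):
                trial = list(x)
                trial[i] += s
                ft = f(trial)
                if ft < fx:              # accept the first improving poll point
                    x, fx, improved = trial, ft, True
                    break
            if improved:
                break
        if not improved:
            step *= 0.5                  # unsuccessful poll: contract the pattern
    return x, fx


def quadratic(x):
    """Badly scaled convex test objective with minimizer (1.3, -2.7)."""
    return (x[0] - 1.3) ** 2 + 10.0 * (x[1] + 2.7) ** 2


x_min, f_min = compass_search(quadratic, [0.0, 0.0])
print("approximate minimizer:", x_min)
```

Only objective values are compared, so no derivatives are required or estimated. The poll points within one iteration are independent of one another, which is the source of the parallelism mentioned above.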

We  have  developed  pattern  search  algorithms  for  general  nonlinearly  constrained  optimization  guaranteed 
to  possess  first-order  stationary  point  convergence.  We  are  presently  engaged  in  an  implementation  of  the 
new  classes  of  pattern  search  algorithms.  This  new  implementation  will  allow  us  to  investigate  various 
algorithmic  approaches,  as  well  as  opportunities  for  improved  computational  parallelism. 

Among the algorithmic approaches we will investigate are techniques to improve scaling in pattern search
algorithms  via  the  aggregation  of  similarly  scaled  design  variables.  We  will  also  investigate  opportunities  for 
algorithmic  steering  in  connection  with  pattern  search  algorithms. 

A  Posteriori  Finite  Element  Bounds  for  Sensitivity  Calculations 

In the optimization of systems governed by differential equations one would like to use the coarsest mesh
possible  at  any  given  step  so  as  to  reduce  the  cost  of  the  optimization  iteration.  In  a  recent  series  of  papers, 
Patera,  Peraire,  and  their  collaborators  have  presented  an  a  posteriori  approach  to  computing  quantitative 
bounds  on  the  mesh  dependence  of  certain  functionals  of  the  solutions  of  differential  equations.  We  have 
begun  to  apply  these  ideas  in  the  context  of  optimization. 

We  have  developed  a  posteriori  bounds  for  sensitivities  of  output  linear  functionals  with  respect  to 
various parameters (such as coefficients) in boundary-value problems. Using either the sensitivity equations
or  adjoint  equations  one  can  write  the  output’s  sensitivity  as  a  functional  of  the  solution  of  a  system  of 
differential  equations.  One  then  computes  bounds  on  the  error  in  the  sensitivities  on  a  coarse  grid  relative 
to  a  finer  grid.  Numerical  results  indicate  that  the  bounds  can  be  quite  good.  We  have  also  extended  the  a 
posteriori  bound  approach  to  certain  non-smooth  functionals. 

We are currently investigating extensions of the bound procedure to more complex equations and output
functionals.  We  are  also  implementing  an  approach  to  using  the  a  posteriori  bound  procedure  in  connection 
with  pattern  search  methods,  a  first  step  in  a  larger  investigation  of  using  approximate  function  values  with 
error  bounds  in  optimization. 

This research was conducted in collaboration with Tony Patera and Jaime Peraire (Massachusetts Institute
of Technology).

Analysis  of  Hessians  in  Parameter  Estimation  Problems  Governed  by  Differential  Equations 

Parameter  estimation  problems  in  systems  governed  by  differential  equations  arise  frequently  in  non¬ 
destructive  evaluation  and  materials  characterization.  Similar  problems  also  arise  in  design  optimization.  In 
both cases it is useful to understand the analytical nature of the resulting optimization problems.

We  have  completed  a  preliminary  study  of  the  Hessians  of  the  objective  for  a  class  of  optimization 
problems that arise in design and in parameter estimation. We have established how, in many instances,
the  Gauss-Newton  approximation  of  the  Hessian  may  prove  to  be  very  much  in  error  when  compared  to  the 
complete  Hessian. 
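
A small worked example shows how the two Hessians can differ. It is a sketch under assumptions that are not part of this study: a one-parameter least-squares fit of the model exp(-p*t) to data, with objective f(p) = (1/2) * sum r_i(p)^2 and residuals r_i(p) = exp(-p*t_i) - d_i. The complete Hessian is sum (r_i')^2 + sum r_i * r_i'', while the Gauss-Newton approximation keeps only the first sum. The function names and data are hypothetical.

```python
import math


def hessians(p, times, data):
    """Return (Gauss-Newton Hessian, complete Hessian) of
    f(p) = 0.5 * sum_i (exp(-p * t_i) - d_i)**2 for a scalar parameter p."""
    gauss_newton = 0.0
    curvature = 0.0
    for t, d in zip(times, data):
        model = math.exp(-p * t)
        r = model - d            # residual
        dr = -t * model          # first derivative of the residual
        d2r = t * t * model      # second derivative of the residual
        gauss_newton += dr * dr
        curvature += r * d2r     # term dropped by the Gauss-Newton approximation
    return gauss_newton, gauss_newton + curvature


def objective(p, times, data):
    """Least-squares misfit, used here only for a finite-difference check."""
    return 0.5 * sum((math.exp(-p * t) - d) ** 2 for t, d in zip(times, data))


times = [0.5, 1.0, 1.5, 2.0]
data = [math.exp(-t) for t in times]     # noise-free data generated with p = 1

for p in (1.0, 3.0):
    gn, full = hessians(p, times, data)
    print("p =", p, "Gauss-Newton:", gn, "complete:", full)
```

At p = 1 the residuals vanish and the two Hessians coincide. At p = 3, away from the solution, the residuals are large and the dropped term dominates, so the complete Hessian is negative while the Gauss-Newton approximation remains positive.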

Future work includes extending the analytical approach to more complex equations, and further investigation
of the consequences of this analysis for numerical computation, particularly for quasi-Newton updates
and  preconditioning. 
JOSIP LONCARIC


Spatial  Structure  of  Optimal  Flow  Control 

Designing  distributed  control  systems  begins  with  the  sensor/actuator  placement  problem.  While  in 
some  situations  discrete  search  of  combinatorial  complexity  seems  unavoidable,  continuum  problems  suggest 
solving a related question. If one could sense everything and actuate everywhere, what should one do? The
answer  to  this  question  has  polynomial  complexity  (of  order  where  N  is  the  number  of  state  variables) 
and  can  serve  as  the  initial  effectiveness  filter  capable  of  rejecting  a  large  portion  of  the  design  search  space. 
This  favorable  situation  can  have  several  causes  depending  on  the  base  flow  pattern.  Our  aim  is  to  develop 
efficient numerical procedures to solve this problem for flows in moderate Reynolds number regimes.

In earlier work, we developed a rational approximation of the optimal feedback kernel for unsteady
Stokes flow. For the flow around a cylinder, this approximation was proven to perform within 0.026% of the
exact  optimum  even  in  the  worst  case.  Using  the  vorticity  representation  in  conformally  mapped  geometries, 
this  approximation  is  decomposed  into  the  analytic  free  space  solution  and  a  boundary  term  which  can  be 
evaluated  numerically.  This  procedure  was  applied  to  the  NACA  0015  wing.  The  results  demonstrate  a 
significant contribution of the boundary to the control effort.

We  are  investigating  the  rational  approximation  as  an  additive  preconditioner  for  the  nonzero  base 
flow  case.  Since  local  dynamics  is  dominated  by  viscosity,  this  approximation  should  correctly  describe  the 
colocated  sensing/actuation  singularity  in  the  optimal  feedback  kernel.  As  a  first  step,  we  are  investigating 
a  shear  flow  where  the  Fourier  transform  in  the  streamwise  direction  can  be  used  to  simplify  the  problem. 
We  intend  to  compare  the  performance  and  quality  of  numerical  results  in  the  preconditioned  formulation 
with  those  obtained  directly.  The  insight  gained  in  this  study  will  provide  guidance  for  the  development  of 
numerical  schemes  for  the  full  NACA  0015  wing  case  at  moderate  Reynolds  number  flows. 

The  Coral  Project 

The cost of developing complex computer components such as CPUs has become so high that scientific
applications alone cannot carry the full burden. In the future, scientific computing will have to use mass
market leverage to overcome the cost barrier. A cost-effective alternative to high-end supercomputing was
pioneered by Beowulf, a cluster of commodity PCs. By now, high performance Beowulf clusters can be built
using fast commodity PCs and switched Fast Ethernet. We want to explore the benefits and the limitations
of this approach, based on applications of interest to ICASE.

Based on the available performance and price data, we created a list of configurations and at each
price level selected the dominant configuration. After a discussion of various application benchmarking
requirements, a system consisting of 32 Pentium II 400 MHz nodes and a dual CPU server was selected. The
system’s aggregate peak performance using multiple copies of the ATLAS benchmark exceeds 10 Gflop/s,
while sustained performance on CFD applications is about 1.5 Gflop/s. Our benchmarks show perfect
scaling with balanced coarse grained parallel codes. Fine grained codes show reasonably good scaling with
the number of processors. During benchmarking, we discovered and resolved a performance limitation of the
underlying TCP data transport protocol. Coral has an excellent price/performance ratio, almost an order of
magnitude better than an equivalent supercomputer. This conclusion applies primarily to balanced coarse
grained applications (e.g., domain decomposition codes).

We  expect  to  refine  Coral’s  performance  through  further  benchmarking,  and  to  use  this  system  in  solving 
some  real  problems.  Since  the  cost  of  performance  is  rapidly  decreasing,  we  hope  to  enhance  and  expand 
this cluster in the future.

The Coral Project was initiated by Piyush Mehrotra and Tom Crockett. Additional benchmarking was
done  by  David  Keyes  and  Brian  Allan. 

DIMITRI  J.  MAVRIPLIS 

Large-scale  Unstructured  Mesh  Computations  Using  a  Parallel  Multigrid  Solver 

Unstructured mesh Navier-Stokes solvers offer great potential for reducing the turnaround time associated
with complex geometry aerodynamic analysis. For accurate computation of complicated aerodynamic
flows, very high resolution grids are required. Furthermore, the large computational overheads associated
with unstructured mesh methods require the use of efficient solution algorithms which can be ported to
massively parallel architectures. The purpose of this work is to demonstrate the feasibility of performing
very large scale unstructured mesh computations in a production setting using existing parallel machines.

A  low  memory,  rapidly  converging  unstructured  multigrid  algorithm  has  been  developed  and  ported  to 
parallel computer architectures. The fine and coarse levels of the unstructured multigrid algorithm are all
partitioned sequentially before being distributed on the target parallel machines. Because the algorithm
makes use of implicit line solves, the partitioning must be executed in such a way that the implicit lines
of the various mesh levels are not intersected by processor boundaries. This is achieved by contracting
the mesh graph along the implicit lines and partitioning the contracted (weighted) graph rather than the
original graph of the mesh; the partition of the contracted graph is then used to infer the final mesh partition upon de-contraction. The
communication  patterns  (which  remain  static  for  the  duration  of  the  analysis)  are  then  precomputed  and 
stored.  The  implementation  of  the  parallel  solver  is  based  on  the  MPI  communication  primitives. 
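
The contract, partition, de-contract sequence described above can be sketched in a few lines. This is a simplified illustration and not the implementation used in this work: it assumes vertices are numbered 0 to N-1, that each implicit line is given as a list of vertex numbers, and it replaces a true graph partitioner with a greedy weight-balancing heuristic that ignores edge cuts. The function name `partition_with_lines` is hypothetical.

```python
def partition_with_lines(num_vertices, lines, num_parts):
    """Assign each vertex to a part so that no implicit line is split.

    num_vertices : number of mesh vertices, labeled 0 .. num_vertices - 1
    lines        : list of implicit lines, each a list of vertex labels
    num_parts    : number of processors

    Returns a list giving the part index of every vertex.
    """
    # Contraction: every line becomes one weighted super-vertex; vertices that
    # belong to no line become super-vertices of weight one.
    owner = {}
    groups = []
    for line in lines:
        gid = len(groups)
        groups.append(list(line))
        for v in line:
            owner[v] = gid
    for v in range(num_vertices):
        if v not in owner:
            owner[v] = len(groups)
            groups.append([v])

    # Partition the contracted graph. ASSUMPTION: greedy "heaviest first onto
    # the lightest part" balancing stands in for a real graph partitioner.
    loads = [0] * num_parts
    group_part = [0] * len(groups)
    for gid in sorted(range(len(groups)), key=lambda g: -len(groups[g])):
        part = loads.index(min(loads))
        group_part[gid] = part
        loads[part] += len(groups[gid])

    # De-contraction: each vertex inherits the part of its super-vertex.
    return [group_part[owner[v]] for v in range(num_vertices)]


lines = [[0, 1, 2, 3], [4, 5, 6]]
parts = partition_with_lines(12, lines, 3)
print("part of each vertex:", parts)
```

Because a whole line is a single super-vertex during partitioning, every vertex on it ends up on the same processor, so the implicit line solves never cross a processor boundary.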

Good  scalability  of  the  unstructured  mesh  multigrid  solver  has  been  demonstrated  on  medium  size 
problems  involving  several  million  grid  points  on  both  a  CRAY  T3E-600,  using  up  to  512  processors,  and  an 
SGI  Origin  2000,  using  up  to  128  processors.  A  complete  high-lift  aircraft  geometry  case  has  been  solved  on 
a  grid  of  25  million  points  in  4.5  hours  on  512  processors  of  the  Cray  T3E.  The  same  case  has  also  been  run 
on  1450  processors  of  a  CRAY  T3E-1200E,  which  required  just  over  one  hour  of  compute  time. 

The current solver has also been benchmarked on the ASCI Red and ASCI Blue-Pacific parallel computers,
illustrating good scalability on these machines as well.

Future work is concentrated on enabling the solution of even larger cases, up to 100 million grid points.
This will require the parallelization of all preprocessing operations such as mesh partitioning and coarse
multigrid level construction. This effort is viewed as the first step towards developing a practical large eddy
simulation capability for aircraft configurations.

CHI-WANG SHU

High-order  Discontinuous  Galerkin  Method  and  WENO  Schemes 

Our motivation is to have high-order non-oscillatory methods for structured and unstructured meshes
which are easy to implement on parallel machines. The objective is to develop and apply high-order discontinuous
Galerkin finite element methods and weighted ENO (WENO) schemes for convection-dominated
problems. The applications will be problems in aeroacoustics and other time-dependent problems with
complicated solution structure.

Jointly with Harold Atkins (NASA Langley), we are continuing to develop the
discontinuous Galerkin method for solving convection-dominated convection-diffusion equations. Emphasis
for this period is put upon studying the stability and accuracy issues involving both internal and domain
boundary conditions. A discontinuous Galerkin method for 2D incompressible flow is also under development
jointly  with  Jian-Guo  Liu  (University  of  Maryland).  Jointly  with  Changqing  Hu  (Brown  University),  we 
have  been  pursuing  adaptive  methods  using  structured  and  unstructured  high-order  weighted  ENO  schemes. 
Preliminary  results  using  a  structured  WENO  code  on  the  double  Mach  reflection  problem  indicate  good 
resolution  and  a  saving  of  75%  in  terms  of  spatial  mesh  points  over  the  uniform  mesh  code. 

Research  will  be  continued  for  high-order  discontinuous  Galerkin  methods  and  weighted  ENO  methods 
and  their  applications. 

DAVID  SIDILKOVER 

Factorizable  Schemes  and  Essentially  Optimal  Multigrid  Solvers  for  the  Flow  Equations 

The main objective of this work is to develop discretization schemes that facilitate construction of
essentially  optimal  multigrid  solvers  for  the  equations  of  steady  compressible  flow.  Our  first  target  is  the 
Euler  equations  in  two  dimensions.  However,  the  methodology  being  developed  is  very  general.  It  can  be 
extended  to  Navier- Stokes  equations  and  to  three-dimensional  problems. 

A factorizable high-resolution scheme for the compressible Euler equations has been constructed. The
factorizability property is crucial for constructing essentially optimal multigrid solvers, since it makes it
possible to distinguish between the advection and full-potential factors of the system on the level of the discrete
scheme. The key ingredient of such a solver is a relaxation procedure that relies on the auxiliary potential
and stream-function variables and, therefore, utilizes the factorizability property. Another important
implication of the factorizability property is that the scheme should not lose accuracy for low Mach number
flow. The proposed approach also allows the combination of h-ellipticity and high-resolution properties in
one scheme.

The  current  work  is  devoted  to  extending  the  scheme/solver  to  general  body-fitted  grids.  Extensions  of 
the  approach  to  viscous  and  three-dimensional  problems  are  in  progress  as  well. 

SEMYON  TSYNKOV 

Artificial  Boundary  Conditions  for  Aerodynamic  and  Aeroacoustic  Computations 

Many typical problems in aerodynamics and aeroacoustics, including those that present immediate
practical interest, e.g., flows around aircraft and problems of acoustic radiation/propagation/scattering, are
formulated on infinite domains. It is, therefore, obvious that any numerical methodology for solving such
problems has to be supplemented (or, rather, preceded) by some technique that would lead to a finite
discretization. Typically, the original domain is truncated prior to the actual discretization and numerical
solution. Subsequently, one can construct a finite discretization on the new bounded computational domain
using one of the standard techniques: finite differences, finite elements, or other. However, both the
continuous problem on the truncated domain and its discrete counterpart will be subdefinite unless supplemented by
the appropriate closing procedure at the external computational boundary. This is done by using artificial
boundary conditions (ABC’s); the word “artificial” emphasizes here that these boundary conditions are
necessitated by numerics and do not come from the original physical formulation.

At the current stage of the aforementioned project, we are focusing on the following two research topics.
First, we construct highly accurate global boundary conditions for the calculation of steady-state flows using
the new generation of advanced factorizable finite-difference schemes and fast multigrid solvers. For the
initial test cases the boundary conditions are obtained analytically via conformal mappings; at later stages
we will employ the difference potentials method, which has already demonstrated excellent performance in our
previous work. The new schemes themselves have already led to a multifold reduction in the solution time
(compared to the standard methods); when combined with the advanced external boundary conditions that
allow for an order of magnitude decrease in the domain size without loss of accuracy, the new methodology
may potentially result in more than two orders of magnitude overall reduction in the configuration analysis
cycle. Second, we develop the exact ABC’s for time-dependent problems. The approach is based on exploiting
the weak lacunae in numerical solutions of the wave-type equations. This allows effective restriction of the
temporal nonlocality of the ABC’s; otherwise the procedure would be prohibitively expensive. We have
studied the lacunae both analytically and experimentally and have already calculated the solutions to some
model problems for the wave equation using the new ABC’s methodology; the results seem very promising.
A series of conference and journal papers is in preparation on both foregoing subjects.

Future  research  in  the  framework  of  this  project  will  primarily  concentrate  on  developing  the  unsteady 
ABC’s  algorithms  for  problems  in  acoustics,  including  the  advective  case,  and  electromagnetics. 

This  research  was  conducted  in  collaboration  with  V.  Ryaben’kii,  D.  Sidilkover,  S.  Abarbanel,  and  J. 
Nordstrom  (ICASE)  and  V.  Vatsa,  T.  Roberts,  C.  Swanson,  J.  Thomas,  and  H.  Atkins  (NASA  Langley). 
The  project  is  supported  by  the  Director’s  Discretionary  Fund. 



PHYSICAL  SCIENCES,  FLUID  MECHANICS 


RICHARD  W.  BARNWELL 

Hyperbolic  Reynolds  Stress  Model  for  Turbulent  Boundary  Layers 

The  boundary  layer  equations  for  incompressible  mean  flow  with  the  turbulence  model  provided  by  the 
Reynolds  stress  equations  are  shown  to  be  hyperbolic  in  the  outer  region  where  convection  and  diffusion 
dominate.  Because  diffusion  is  of  inconsequential  magnitude  in  the  turbulent  interior,  it  can  be  either 
ignored  or  approximated  appropriately  there  so  that  the  governing  equations  are  hyperbolic  across  the  entire 
turbulent  part  of  the  boundary  layer.  Consequently,  hyperbolic  solution  techniques  can  be  used  to  advantage 
to solve the turbulent boundary layer, as Peter Bradshaw did over 30 years ago with a more approximate
formulation. The hyperbolic solutions so obtained depend on conditions immediately upstream of the solution
point  and  may  give  a  better  representation  of  the  diverse  behavior  in  complex  three-dimensional  boundary 
layers  than  traditional  parabolic  solutions. 

Closure assumptions are required to relate the diffusion terms, which are dominated by derivatives of
time-averaged triple products of the fluctuating velocity components, to the Reynolds stresses. The
traditional approach is to replace the triple products with terms involving derivatives of the Reynolds stresses and
solve the resulting parabolic problem. Experimental data show that the Reynolds stresses vary algebraically
with distance from the mean boundary layer edge in the outer region where convection and diffusion
dominate, and an asymptotic analysis shows that such functions satisfy a differential equation which renders
the traditional differential representations of the triple products equivalent to the algebraic representation
developed by Bradshaw. The result is a set of hyperbolic governing equations with fewer modeling constants
than the corresponding parabolic set. In the hyperbolic approach the additional data are provided by initial
conditions. The hyperbolic stress model is used to explain why the lateral spreading rate of a turbulent
wedge in a laminar boundary layer is so much larger than the vertical boundary layer growth.

The  next  task  is  to  compare  the  results  of  this  method  to  those  of  other  methods  and  experimental  data. 

SANG-HYON  CHU 

Development of Microwave-driven Smart Material Actuator

“Wireless” control of actuators with microwaves offers tremendous advantages over hard-wired actuators,
especially for space applications such as the Next Generation Space Telescope (NGST), in which thousands
of discrete actuators are required to effect high-precision distributed shape control of the primary reflector.
This new concept alleviates the need for hard-wired connections, resulting in significantly simpler system
designs and lower system mass.

3x3 rectenna patches built at JPL were tested in an anechoic chamber by modulating microwave power
level, frequency, incidence angle, and polarization angle. The PZT 5A multilayer piezoelectric actuator
was selected as the smart actuator and tested under a direct coupling with a 3x3 rectenna. The
experimental results indicate that the multilayer piezoelectric actuator can be successfully utilized with a
wide degree of controllability when the 3x3 patch rectenna converts microwave energy to DC power that, in
turn, drives the actuator.

The dispersive nature of microwaves might cause energy loss during transmission. The concept of
power allocation and distribution will be considered for this reason. Logic circuits embedded in rectennas
will control power collection and allocation to feed DC power to any actuator where optical correction is
necessary. A prototype of a power distribution circuit will be fabricated and improved to meet all required
characteristics in the future.

AYODEJI  DEMUREN 

Streamwise  Vorticity  Generation  in  Jets 

Experiments have shown that three-dimensional jets can be used to enhance mixing and entrainment
rates in comparison to axisymmetric jets. A fundamental understanding of the dynamics of complex,
turbulent jets is required for their prediction and control. Understanding of the evolution of the streamwise
vorticity fields is essential. Experiments have used streamwise and azimuthal vorticity dynamics to explain
the presence or absence of axis-switching in experimental measurements of 3:1 aspect ratio rectangular jets
with different initial conditions. This study showed that the presence of streamwise vorticity pairs with
outflow rotation (pumping fluid from the core to the ambient perpendicular to the major axis plane) produced
axis-switching while pairs with the opposite sense of rotation did not. However, in jets with no streamwise
vorticity at discharge, some other mechanism must originate it.

Generation mechanisms are investigated via Reynolds-averaged (RANS), large-eddy (LES), and direct
numerical (DNS) simulations of laminar and turbulent rectangular jets. Complex vortex interactions are
found in DNS of laminar jets, but axis-switching is observed only when a single instability mode is present in
the incoming mixing layer. With several modes present, the structure is not coherent and no axis-switching
occurs. RANS computations also produce no axis-switching. On the other hand, LES of high Reynolds
number turbulent jets produces axis-switching even for cases with several instability modes in the mixing
layer. Analysis of the source terms of the mean streamwise vorticity equation through post-processing of the
instantaneous results shows that a complex interaction of gradients of the normal and shear Reynolds stresses
is responsible for the generation of streamwise vorticity which leads to axis-switching. RANS computations
confirm these results. k-ε turbulence model computations fail to reproduce the phenomenon, whereas
algebraic Reynolds stress model (ASM) computations in which the secondary normal and shear stresses
are computed explicitly succeed in reproducing the phenomenon accurately.

More  quantitative  comparisons  to  experimental  data  are  planned. 

SHARATH  S.  GIRIMAJI 

Pressure-strain  Correlation  Modeling:  Testing  and  Validation 

At the second moment closure level, accurate modeling of turbulent flows is contingent upon accurate
modeling of the pressure-strain correlation term. Development of pressure-strain correlation models valid
for  complex  flows  is  the  objective  of  this  project. 

We have entered the final stages of validating and fine-tuning the model. After successful validation in
a  variety  of  benchmark  problems,  more  subtle  issues  on  the  manner  of  interpolation  between  extreme  states 
are  being  addressed.  While  matched  asymptotic  expansion  techniques  are  theoretically  sound,  they  appear 
to  lead  to  very  complex  model  forms.  Other  avenues  are  being  explored. 

Further testing and systematic development for intermediate states of turbulence will come next.

Rotating Turbulent Flows

The effect of rotation on turbulence still remains an enigma in many practical flow situations. Our
objective is to understand the behavior of irrotational fluctuations in a flow with strong mean-flow rotation.

While  the  effect  of  rapid  rotation  on  rotational  fluctuations  is  well  described  by  the  Taylor-Proudman 
theorem,  the  behavior  of  irrotational  fluctuations  is  not  well  known.  We  demonstrate  that  the  Navier-Stokes 
equations permit a large family of irrotational solutions in two-dimensional or rapid rotation limits. This
could  lead  to  improved  insight  into  the  behavior  of  turbulence  in  rotating  flows. 

The  importance  of  irrotational  fluctuations  in  turbulence  needs  to  be  further  expounded. 

This research was conducted in collaboration with J.R. Ristorcelli (Los Alamos National Laboratory).

Non-equilibrium  Algebraic  Reynolds  Stress  Modeling 

Computationally  viable,  yet  physically  accurate  turbulence  models  are  needed  for  large-scale,  practical 
flow  computations.  We  develop  such  a  model  starting  from  the  physically  sophisticated  but  computationally 
expensive  second-order  closure. 

The  theory  of  complex  dynamical  systems  is  being  studied  to  develop  new  reduction  procedures.  A 
scheme  based  on  minimization  of  evolution  potential  was  developed  and  is  currently  undergoing  close  scrutiny. 
This  appears  to  offer  important  advantages  over  previous  methods.  A  simple  procedure  for  identifying  the 
slow  (master)  variables  is  developed. 

Extensive testing of the slow variable selection criterion and the reduction procedure will come next.

C.E.  GROSCH 

Simulation  of  Supersonic  Jet  Mixing  by  Tabs  in  Lobe  Ejectors 

Mixing enhancement of high- and low-speed streams is utilized as a means to improve the efficiency of
supersonic combustors, reduce aircraft signatures, and control high-speed jet noise. One common method of
mixing enhancement is to use lobe mixer ejectors. Another is to place tabs on the edges of the jets. For the
most part, it is experimental studies that are available to evaluate the performance and guide the design of these mixers.
The  objective  of  this  research  is  to  use  numerical  simulation  to  examine  the  performance  of  lobe  ejectors, 
both  with  and  without  tabs,  in  order  to  understand  the  physics  of  the  mixing  and  how  it  is  affected  by 
changes  in  the  parameters  of  these  devices. 

A set of numerical calculations is carried out using the compressible, three-dimensional, time-dependent
Navier-Stokes equations. Tabs are modeled by pairs of counter-rotating vortices. Various geometric
configurations of the lobe mixers are simulated with periodic side boundary conditions to simulate an array of these
devices.

The simulations of the lobe mixer without tabs show that the jet becomes unstable and oscillates in
the “garden hose” mode. For a particular lobe geometry and velocity ratio, the oscillation has a constant,
narrow-band frequency near the inflow. Further downstream the amplitude grows and the motion becomes
nonlinear, leading to spectral broadening. The typical Strouhal number of the narrow-band oscillation is about
0.45. The physics of this phenomenon is related to the generation of streamwise vorticity at the edges of the jet. As
the disturbances become nonlinear, rapid mixing between the supersonic and subsonic jets occurs and, by
about halfway down the channel, the jet and coflow become nearly fully mixed. A set of simulations of the
same geometry with tabs has begun. The results of the first of these have been partially analyzed.
Future experimental and numerical studies are required to define more clearly the initial induced vorticity
field in the round jet. It is hoped that future experiments will use PIV imaging to measure the cross-stream
vectors. The experimental data could then be used to set the inflow conditions for the simulations.

Further  calculations  are  planned  for  the  lobe  mixer  including  varying  the  geometry  and  varying  the 
placement  of  the  tabs  on  the  sides  of  the  lobes. 

ROGER  HART 

Flow  Diagnostics  Using  Laser-induced  Thermal  Acoustics 

The non-intrusive optical measurement of gas-phase parameters such as temperature, flow velocity, and
pressure is of considerable utility in understanding the airflow around a test body in a wind tunnel.
Laser-induced thermal acoustics (LITA) is a relatively new optical diagnostic method that has great promise for
becoming a practical, accurate flow characterization tool. Two laser pulses are employed in LITA. The
first pulse creates a pair of counterpropagating acoustic wavepackets. The second pulse is diffracted by the
wavepackets onto a detector. Analysis of the various features of the LITA waveform allows the determination
of the speed of sound in the medium (and thus the bulk temperature), one or more components of the flow
velocity, and the density or pressure. Advantages of LITA as compared to other, better-developed diagnostics
are: LITA allows seedless velocimetry; LITA measurements take only about one microsecond, giving the
potential for very high repetition rates for the study of turbulent flows; and LITA gives excellent (1%)
single-shot accuracy and precision. The goal of the current work is to completely understand the physics of
the LITA measurement process and to embody that understanding in a quantitative model which has been
carefully validated against laboratory experiments.
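
To illustrate how a measured speed of sound yields a bulk temperature, the following minimal sketch
assumes a calorically perfect ideal gas, for which a^2 = gamma*R*T/M. The function names and the default
values for air (gamma = 1.4, M = 0.028964 kg/mol) are assumptions made for this example; they are not
taken from the LITA model described here.

```python
import math

R_UNIVERSAL = 8.314462618  # universal gas constant, J/(mol K)


def temperature_from_sound_speed(sound_speed, gamma=1.4, molar_mass=0.028964):
    """Return the ideal-gas temperature in K for a sound speed in m/s.

    Uses a^2 = gamma * R * T / M. The defaults describe dry air and are
    illustrative assumptions, not values from the report.
    """
    return sound_speed ** 2 * molar_mass / (gamma * R_UNIVERSAL)


def sound_speed_from_temperature(temperature, gamma=1.4, molar_mass=0.028964):
    """Inverse relation: sound speed in m/s for a temperature in K."""
    return math.sqrt(gamma * R_UNIVERSAL * temperature / molar_mass)


if __name__ == "__main__":
    a = sound_speed_from_temperature(300.0)
    print(f"sound speed at 300 K: {a:.1f} m/s")
    print(f"recovered temperature: {temperature_from_sound_speed(a):.1f} K")
```

Because temperature scales with the square of the sound speed, a given relative error in the measured
sound speed roughly doubles when expressed as a relative error in temperature.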

The fundamental optical and acoustical mechanisms of LITA are well understood; nevertheless,
combining these to create a model that can accurately and robustly duplicate the results of well-controlled
experiments has involved considerable effort. On the experimental side, we currently make measurements in
calibrated air flows using standard laboratory-style lasers and optics, as this allows the greatest flexibility
and control, though thought is being given to simplifying and hardening the equipment for use in production
wind tunnels. Modeling currently combines fairly simple models for low-amplitude (linear regime) sound
waves and standard optical diffraction theory. One recent accomplishment was learning how to correctly
include the effect of the finite size of the acoustic wavepackets on the decay rate of the LITA waveform.
The decay of the signal limits the precision of measurements of temperature and velocity, and the rate of
the decay is a critical piece of information for determining pressure, so this is of some importance for the
application of LITA.

The major unresolved modeling issue involves explaining certain systematic differences observed among
the decay rates of the three spectral components that make up the LITA signal. An additional series of
experiments is being considered to help constrain our modeling efforts. This research was conducted in
collaboration with R.J. Balla and G.C. Herring (NASA Langley).

LI-SHI  LUO 

Lattice  Boltzmann  Scheme  for  Flow-structure  Interaction 

One important problem in the applications of the lattice Boltzmann equation to various flow problems
is the interaction between fluid flow and solid boundaries, i.e., the implementation of boundary conditions
at fluid-structure interfaces. The moving boundary problem in high Reynolds number flow poses a challenge to
traditional CFD methods. Usually, turbulence modeling has to be employed in such cases. The present
work uses the method of the lattice Boltzmann equation (LBE) to simulate the flow-structure interaction
problem.

With the LBE method, boundary conditions for objects with complicated geometries are easy to
implement. We intend to implement a computationally efficient boundary condition for moving boundaries in flows
with high Reynolds number. Various schemes combining existing bounce-back type boundary conditions with
interpolation (or extrapolation) are under theoretical study and numerical test.
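
For readers unfamiliar with bounce-back boundary conditions, the sketch below shows only the plain
(non-interpolated) rule on a stationary wall: every particle population on a solid node is reversed into
the opposite lattice direction. It is not the interpolated or moving-boundary scheme under study. The
two-dimensional nine-velocity (D2Q9) lattice, its direction ordering, and the names VELOCITIES, OPPOSITE,
and bounce_back are assumptions made for illustration.

```python
# D2Q9 lattice velocities in an assumed ordering: rest, four axis
# directions, then four diagonals.
VELOCITIES = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
              (1, 1), (-1, 1), (-1, -1), (1, -1)]

# OPPOSITE[i] is the index of the direction pointing against direction i.
OPPOSITE = [0, 3, 4, 1, 2, 7, 8, 5, 6]


def bounce_back(f, solid):
    """Apply plain bounce-back on stationary solid nodes.

    f[i][x][y] is the population moving along VELOCITIES[i] at node (x, y);
    solid[x][y] is True on wall nodes. A new array is returned and the
    input is left unchanged.
    """
    out = [[row[:] for row in plane] for plane in f]
    for x, column in enumerate(solid):
        for y, is_solid in enumerate(column):
            if is_solid:
                for i in range(9):
                    # The population that arrived along direction
                    # OPPOSITE[i] leaves along direction i.
                    out[i][x][y] = f[OPPOSITE[i]][x][y]
    return out


if __name__ == "__main__":
    nx, ny = 3, 2
    f = [[[float(i + 10 * x + 100 * y) for y in range(ny)]
          for x in range(nx)] for i in range(9)]
    solid = [[False] * ny for _ in range(nx)]
    solid[1][0] = True
    g = bounce_back(f, solid)
    print(f[1][1][0], "->", g[3][1][0])  # population 1 now travels along direction 3
```

Reversal conserves the mass on each wall node, since the nine populations are only permuted.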

A paper entitled “An Accurate Curved Boundary Treatment in the Lattice Boltzmann Method,”
authored by Renwei Mei, Li-Shi Luo (ICASE), and Wei Shyy has been submitted to the Journal of
Computational Physics. Currently we are working on boundary conditions for a moving boundary.

The present work has been funded by NASA Langley Research Center under the program of “Innovative
Algorithms for Aerospace Engineering Analysis and Optimization.” The Co-PIs of the proposal for the
present work are Renwei Mei (UFL), Li-Shi Luo, and Wei Shyy (UFL). The collaboration also includes
Pierre Lallemand (Director, ASCI-CNRS, Univ. Paris-Sud), and Dominique d’Humières (ENS, Paris).

Lattice  Boltzmann  Model  for  Non-ideal  Gases 

The key issues in the study of multi-phase (e.g., liquid-vapor) flows are the modeling of interfaces and
of phase transitions between different phases. It is difficult to use the Navier-Stokes equations to model
inhomogeneous multi-phase flows because the interfacial tracking is a laborious computation. In the past
few years, a number of lattice Boltzmann models have been developed to model multi-phase flows. However,
the multi-phase lattice Boltzmann equation is still lacking a rigorous theoretical basis. For instance, previous
multi-phase lattice Boltzmann models do not have a consistent equilibrium thermodynamics. The present
work applies the Enskog theory of hard spheres to revise the theory of the multi-phase lattice Boltzmann
equation.

With the Enskog theory we were able to derive a new multi-phase lattice Boltzmann model which has
a consistent equilibrium thermodynamics. We have rigorously demonstrated the deficiencies in the previous
multi-phase  lattice  Boltzmann  models  and  provided  a  systematic  procedure  to  derive  a  correct  multi-phase 
lattice  Boltzmann  model  based  upon  the  Enskog  theory  (or  the  revised  Enskog  theory).  A  brief  account 
of  the  present  work  has  been  published  in  Physical  Review  Letters  and  as  an  ICASE  report.  An  extended 
version  of  the  work  has  been  submitted  to  Physical  Review  E  and  a  corresponding  ICASE  report  is  in 
preparation. 

We  intend  to  derive  a  thermodynamically  consistent  multi-component  lattice  Boltzmann  model  in  the 
future  based  upon  the  same  methodology. 

ALEX  POVITSKY 

Computation  of  Three-dimensional  Acoustic  Fields 

Our goal is to improve the parallelization efficiency of sets of linear banded systems which represent a core
part of implicit and compact solvers. To use processors for other tasks while they are idle from recursive
algebraic computations, we run processors by a schedule rather than by communication. This schedule is
generated before CFD computations are executed.

To improve parallelization efficiency, we combined our Immediate Backward Pipelined Gaussian
Elimination (IB-PTA) with the known Two-Way Pipelined Gaussian Elimination (TW-PTA) to obtain the
Immediate Backward Two-Way Pipelined Thomas Algorithm (IBTW-PTA). To generate the processor schedule,
we use a recursive algorithm for the row of the first P/2 processors as described in our ICASE Report and make
a symmetric reflection of this schedule for the last P/2 processors. Then we include the exchange of the
forward-step coefficients between the (P/2)th and (P/2+1)th processors and the solution of a 2x2 system. These tasks are
performed immediately after completing the forward-step computations for each group of lines on the middle
processors. Measurements on the CRAY-T3E show an advantage of the proposed algorithm over the standard
PTA, the TW-PTA, and the IB-PTA for 8 and 16 processors-in-row. Reduction of processor idle time and
a large optimal size of the pocket of lines (low communication latency time) ensure a low parallelization penalty
for the proposed algorithm.
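
As background for the pipelined variants named above, the following is a minimal serial sketch of the
Thomas algorithm for a single tridiagonal line. It shows the forward step and the backward step whose
row-to-row recursion is what leaves processors idle when a line is split across them. It is not the
IB-PTA, TW-PTA, or IBTW-PTA scheduling itself; the function name and argument layout are assumptions
made for this example.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve one tridiagonal system with the serial Thomas algorithm.

    lower[i] multiplies x[i-1] (lower[0] is unused), diag[i] multiplies
    x[i], and upper[i] multiplies x[i+1] (upper[-1] is unused). No
    pivoting is performed, so a diagonally dominant matrix is assumed.
    """
    n = len(diag)
    c_prime = [0.0] * n  # forward-step coefficients
    d_prime = [0.0] * n

    # Forward step: each row depends on the row before it.
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom if i < n - 1 else 0.0
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom

    # Backward step: each unknown depends on the one after it.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


if __name__ == "__main__":
    # Right-hand side built from the known solution [1, 2, 3, 4].
    solution = thomas_solve([0.0, 1.0, 1.0, 1.0],
                            [4.0, 4.0, 4.0, 4.0],
                            [1.0, 1.0, 1.0, 0.0],
                            [6.0, 12.0, 18.0, 19.0])
    print(solution)
```

In a pipelined parallel version, a processor holding an interior segment of the line cannot start its
forward step until the coefficients from the preceding segment arrive, which is the idle time that a
precomputed schedule aims to fill with other work.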

We are working on the implementation of this algorithm in a 3D aeroacoustic solver (with P. Morris);
on the implementation of the processor schedule for multigrid line solvers (with B. Diskin); for front-type solvers
where grid lines are data-dependent; and for multi-zone solvers where a processor might handle pieces
of different grids.

C.-C.  ROSSOW 

Investigation  of  the  Properties  of  the  MAPS  Flux  Splitting  Scheme 

Several efforts have been focused on the development of discretization methods that combine the
accuracy of flux-difference splittings in capturing shear layers with the robustness of flux vector splittings in
capturing strong shock waves. One recent contribution to this class of hybrid flux splittings is the MAPS
(Mach number based Advection Pressure Splitting) scheme. Significant features of the MAPS scheme are
its simplicity, its robustness, and the fact that no entropy condition is required. Further research revealed
that the scheme is very similar to the Roe flux-difference splitting, with the exception that no intermediate
state needs to be computed. It was found that in the original MAPS formulation only the compressible
terms of the Roe-scheme are retained. Including the incompressible terms of the Roe-scheme in the MAPS
formulation extended MAPS to incompressible flows. In the research to be conducted, the connection of
the MAPS discretization with the Roe-scheme shall be further exploited. On the one hand, a better
understanding of the terms necessary for low Mach number preconditioning is sought. On the other hand,
research will be directed towards convergence acceleration by implicit methods. For implicit schemes, the
flux Jacobians need to be evaluated, which is well established for the Roe-scheme. Due to the similarity
of the MAPS and Roe discretizations, it is expected that simplifications to the implicit operators can be made.
This may be essential for unstructured methods, where the directional techniques for implicit residual
smoothing from structured codes cannot be applied straightforwardly, while fully implicit methods in 3D are still
prohibitive due to storage requirements.

The first area of research is the formulation of a consistent preconditioning matrix. An analysis of
the Roe-scheme revealed that in the incompressible limit pressure terms dominate the artificial dissipation.
These pressure terms are scaled by the inverse of the speed of sound. In order to remove the stiffness for
incompressible flows, the speed of sound in these terms is artificially reduced, thus making these terms
even more dominant. It appeared logical that these artificially increased pressure differences have to be
balanced by properly scaled, artificially introduced time-derivatives of pressure. Adding these pressure
time-derivatives to the equations written in strong conservation form leads to a preconditioning matrix
that is identical to the Choi/Merkle preconditioning in the incompressible limit. However, due to the proper
scaling, for compressible flows the nonpreconditioned compressible equations are recovered, a feature not
shared by the original Choi/Merkle preconditioner.

In the second area of research, the possibility of exploiting the MAPS formulation for an acceleration
technique similar to implicit residual smoothing will be investigated. In the MAPS formulation, advective and
pressure terms appear separately. Using an implicit, scalar smoothing for the advective terms in each equation
while treating the pressure terms explicitly as a source term may result in an acceleration technique similar
to the directional smoothing well established in structured methods. However, the directional dependence
will be avoided, making the technique feasible for unstructured methods. Depending on the results obtained
with this simplified implicit scheme, the MAPS discretization may be incorporated into a fully
implicit formulation.

ROBERT  RUBINSTEIN 

Shock  Wave  Propagation  in  Weakly  Ionized  Gases 

It  has  been  proposed  that  the  mechanism  responsible  for  anomalous  properties  of  shock  waves  in  weakly 
ionized  gases  could  be  identified  by  measuring  the  relaxation  time  of  these  properties  following  extinction 
of  the  plasma  source,  and  matching  it  to  the  relaxation  times  of  the  nonequilibrium  phenomena  known  to 
exist  in  weakly  ionized  gases.  When  these  relaxation  times  cannot  be  measured  directly,  they  are  inferred 
theoretically,  usually  by  assuming  relaxation  from  a  state  nearly  in  thermal  equilibrium.  This  proposal 
therefore  requires  the  understanding  of  relaxation  from  a  steady  state  far  from  thermal  equilibrium.  In  order 
that  the  matching  be  unambiguous,  the  relaxation  rates  in  the  weakly  ionized  gas  must  be  known  precisely. 

We  investigated  this  relaxation  for  two  typical  problems:  the  relaxation  of  a  steady  state  described 
by  a  power-law  distribution  function,  and  the  relaxation  of  a  non-equilibrium  steady  state  in  a  gas  of 
light particles diffusing in a gas of heavy particles. In both examples, it is found that relaxation is much
slower  than  relaxation  from  a  near-equilibrium  state.  The  explanation  is  that  if  the  Boltzmann  equation  is 
satisfied  away  from  the  momentum  space  sources  and  sinks  which  maintain  the  non-equilibrium  steady  state, 
relaxation  to  thermal  equilibrium  requires  that  the  effects  of  extinguishing  the  sources  and  sinks  diffuse  over 
all of momentum space. This relaxation can be very slow. We conclude that the relaxation times in a non-equilibrium
weakly ionized gas may be evaluated incorrectly if exponential relaxation from a near-equilibrium
state  is  assumed.  A  correct  calculation  will  require  a  more  detailed  molecular  model  of  the  weakly  ionized 
gas,  at  the  level  of  a  Boltzmann  equation  at  least. 

The pressure fields produced in the regions of unbalanced charge ahead of and behind the shock have been
proposed  as  sources  of  increased  sound  speed  and  anomalous  shock  properties.  The  possibility  of  non-ideal 
gas  corrections  to  the  equation  of  state  due  to  large  electrostatic  forces  will  be  investigated  next. 

This  research  was  conducted  in  collaboration  with  A.H.  Auslender  (NASA  Langley). 

Boundary  Layer  Receptivity  in  the  Presence  of  Random  Surface  Roughness  and  Acoustic  Excitation 

There  is  still  no  well-established  procedure  for  incorporating  transition  in  turbulence  calculations.  While 
aerodynamic  flows  can  be  computed  successfully  using  any  of  several  different  turbulence  models  if  the 
transition  location  is  prescribed  in  advance,  no  single  turbulence  model  can  reliably  predict  transition.  If 
transition  is  computed  incorrectly,  the  entire  flow  calculation  is  generally  unsatisfactory. 

As  part  of  a  larger  program  of  integrating  transition  and  turbulence  models,  the  first  stage  of  transition, 
boundary layer receptivity, is being considered from a probabilistic viewpoint. The analysis allows both
the surface roughness distribution and the acoustic excitation, which combine to excite Tollmien-Schlichting
waves, to vary randomly. A simple stochastic differential equation is being investigated as a model of
this process. Preliminary Monte Carlo simulations demonstrate the effect of random surface roughness in
enhancing receptivity.

The  analysis  will  be  extended  to  permit  prediction  of  the  probability  density  function  of  receptivity 
amplitude  as  a  function  of  downstream  position.  The  calculation  will  also  be  extended  to  more  realistic 
models of receptivity, including downstream variability of the growth rate of Tollmien-Schlichting waves.

This research was conducted in collaboration with S.S. Girimaji (ICASE) and C.L. Streett (NASA
Langley). 

Theory  of  Rotating  and  Stratified  Turbulence 

The  theory  of  weak  turbulence  describes  the  inertial  range  structure  of  rotating  turbulence,  considered 
as  a  system  of  interacting  inertial  waves.  It  is  natural  to  ask  whether  a  similar  description  of  the  dissipation 
range  is  possible  when  wave  effects  persist  into  the  dissipation  range.  This  analysis  is  also  motivated  by 
the  known  anisotropy  of  energy  transfer  in  rotating  turbulence:  unlike  non-rotating  turbulence,  in  which 
energy is transferred from larger to smaller scales of motion, in rotating turbulence, energy is simultaneously
transferred  to  the  plane  perpendicular  to  the  rotation  axis. 

It is known that dissipation range interactions are between modes with nearly collinear wavevectors. It
is  shown  that  the  dispersion  relation  of  inertial  waves  permits  such  interactions  to  be  resonant  only  when 
the  wavevectors  are  nearly  perpendicular  to  the  rotation  axis.  Accordingly,  the  dissipation  range  in  strongly 
rotating  turbulence  is  concentrated  near  this  wavevector  plane.  Since  inertial  range  interactions  transfer 
energy  into  this  region,  it  is  plausible  that  the  dissipation  range  should  be  concentrated  near  it. 
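
The resonance argument can be illustrated with the standard dispersion relation for inertial waves. The notation below (rotation rate Ω about the unit vector ẑ, angle θ between the wavevector and the rotation axis, wave sign s) is assumed here for illustration and is not taken from the report.

```latex
% Inertial waves in a frame rotating at rate \Omega about \hat{z}:
\omega_s(\mathbf{k}) = 2 s\,\Omega\,\frac{\mathbf{k}\cdot\hat{z}}{|\mathbf{k}|}
                     = 2 s\,\Omega\cos\theta_{\mathbf{k}}, \qquad s = \pm 1 .

% A triad \mathbf{k} + \mathbf{p} + \mathbf{q} = 0 is resonant when
\omega_{s_k}(\mathbf{k}) + \omega_{s_p}(\mathbf{p}) + \omega_{s_q}(\mathbf{q}) = 0 .

% For exactly collinear wavevectors, \hat{p} = \pm\hat{k} and \hat{q} = \pm\hat{k},
% so every frequency equals \pm 2\Omega\cos\theta_{\mathbf{k}} and the condition becomes
2\,\Omega\cos\theta_{\mathbf{k}}\,(\sigma_k + \sigma_p + \sigma_q) = 0,
\qquad \sigma_k, \sigma_p, \sigma_q \in \{+1, -1\} .

% A sum of three signs is odd and therefore never zero, which leaves
\cos\theta_{\mathbf{k}} = 0 .
```

In the collinear limit, resonance therefore requires wavevectors perpendicular to the rotation axis, consistent with the concentration of the dissipation range near that plane.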

This research was conducted in collaboration with Ye Zhou (ICASE and Tuskegee University).

NAIL  YAMALEEV 

A  High-order  Accurate  Method  on  a  Moving  Grid  Adapted  to  the  Solution 

It  is  known  that  the  attainment  of  high-order  accuracy  for  problems  with  shocks  is  problematic,  since  a 
first-order  error  introduced  by  the  shock-capturing  procedure  can  persist  globally  downstream.  One  of  the 
most  effective  ways  to  reduce  this  error  is  to  diminish  the  grid  spacing  in  the  shock  region  alone  rather  than 
refine the grid in the entire computational domain. The main purpose of the present work is to develop
a high-order accurate shock-capturing scheme on a moving grid dynamically adapted to the solution, which
enables one to increase the resolution of high gradients as well as improve the accuracy of the solution in
smooth flow regions.

High-order linear and nonlinear shock-capturing schemes are used to solve the 2D unsteady Euler equations
written in general curvilinear coordinates. For the linear shock-capturing scheme, the interpolation
set for the approximation of the solution is fixed as a function of grid location. For the nonlinear scheme,
the solution is represented by using a high-order accurate polynomial reconstruction, so that the adaptive
stencils employed in the high-order spatial operator are biased towards the smoothest information available.
To generate a grid that combines such important properties as smoothness, orthogonality, and adaptation
simultaneously, the variational approach proposed by Brackbill and Saltzman is employed. Since the Jacobian of
the transformation depends on the temporal coordinate, the geometric conservation law originally introduced by
Thomas and Lombard must be satisfied. The geometric conservation law equation is therefore solved numerically
along with the flow conservation laws using the same conservative difference operators as those employed for
approximating the governing equations. The high-order accurate flow solver and the adaptive grid generator
have been implemented. We are currently joining these codes so that the geometric conservation law is
satisfied automatically at each time step.
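
The role of the geometric conservation law can be seen in a deliberately simple setting. The sketch below is not the high-order 2D scheme described above: it assumes first-order upwinding for 1D linear advection on a periodic grid, with a prescribed node motion (`amp`, `length`, and the sinusoidal motion are hypothetical choices). The cell widths are updated from the same node velocities that enter the fluxes, so a uniform flow is preserved to rounding error.

```python
import math


def advance(u, xi0, a, dt, nsteps, length=1.0, amp=0.05):
    """Advect cell averages u with speed a on a moving periodic 1D grid.

    Node i is the left face of cell i. Returns the final cell averages
    and the final cell widths.
    """
    n = len(u)

    def nodes(t):
        # Hypothetical prescribed grid motion; amp is small enough that
        # cells never tangle.
        return [x0 + amp * math.sin(2.0 * math.pi * x0 / length) * math.sin(t)
                for x0 in xi0]

    def widths(x):
        # The right face of the last cell is node 0 shifted by one period.
        return [(x[(i + 1) % n] + (length if i == n - 1 else 0.0)) - x[i]
                for i in range(n)]

    t = 0.0
    x = nodes(t)
    h = widths(x)
    u = list(u)
    for _ in range(nsteps):
        x_new = nodes(t + dt)
        # Grid velocity consistent with the actual node displacement.
        w = [(xn - xo) / dt for xn, xo in zip(x_new, x)]
        # Upwind flux relative to the moving face i (u[-1] wraps around).
        flux = [(a - w[i]) * (u[i - 1] if a - w[i] >= 0.0 else u[i])
                for i in range(n)]
        h_new = widths(x_new)   # equals h + dt * (w[i+1] - w[i]): discrete GCL
        u = [(h[i] * u[i] - dt * (flux[(i + 1) % n] - flux[i])) / h_new[i]
             for i in range(n)]
        x, h, t = x_new, h_new, t + dt
    return u, h


n_cells = 50
xi0 = [i / n_cells for i in range(n_cells)]
u_uniform, _ = advance([1.0] * n_cells, xi0, a=1.0, dt=0.005, nsteps=200)

u_init = [1.0 + 0.5 * math.sin(2.0 * math.pi * x) for x in xi0]
mass_init = sum(ui / n_cells for ui in u_init)
u_final, h_final = advance(u_init, xi0, a=1.0, dt=0.005, nsteps=200)
mass_final = sum(hi * ui for hi, ui in zip(h_final, u_final))
```

If the cell widths were instead advanced with velocities inconsistent with those used in the fluxes, the uniform state would drift, which is the error the geometric conservation law is meant to exclude.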

We  plan  to  apply  the  present  method  to  calculate  both  steady  and  essentially  unsteady  flows  with  shocks. 

This  research  was  conducted  in  collaboration  with  M.  Carpenter  and  J.  Thomas  (NASA  Langley). 

YE  ZHOU 

On  Higher-order  Dynamics  in  Lattice-based  Models  Using  Chapman-Enskog  Method 

Compared to traditional methods in computational fluid dynamics (CFD), lattice-based models are
simple and easy to implement on computers. The advantages and disadvantages of the original lattice gas
automata (LGA) have been well documented. The lattice Boltzmann equations (LBE) were later introduced
to remove some of the drawbacks. A further simplification of the LBE is achieved using the BGK procedure
(LBGK). It is well established that the Navier-Stokes equations can be deduced at low order in the
Chapman-Enskog expansion. Many authors have further asserted that Burnett-like equations could be
obtained by carrying the Chapman-Enskog expansion to higher order. The motivation of this work is to
carry out these higher-order Chapman-Enskog expansions and to investigate whether it is consistent to do so.

We found that two conditions determine whether lattice-based models can have
higher-order dynamics when the classical Chapman-Enskog expansion is used. These conditions are the number of
conservation laws and the space and time discretization. The pure diffusion model, a system with only one
conserved quantity, is presented first to illustrate that higher-order dynamics is allowed. We then turned
our attention to the lattice-based hydrodynamic equations. After noting the feature of non-commutative
cross time derivatives, we demonstrated how Burnett-like equations could be obtained for lattice-based
hydrodynamics models using the classical Chapman-Enskog expansion method.
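
As a concrete instance of a lattice-based model with a single conserved quantity, a minimal LBGK diffusion step can be sketched as follows. The three-velocity (D1Q3) lattice, the weights, and all names are assumptions chosen for illustration and are not necessarily the model analyzed in this work. At leading order in the Chapman-Enskog expansion, such a model yields the diffusion equation with coefficient D = (tau - 1/2)/3 in lattice units.

```python
# D1Q3 lattice: velocities 0, +1, -1 with the usual weights.
W = (2.0 / 3.0, 1.0 / 6.0, 1.0 / 6.0)


def density(f):
    """The only conserved moment: the sum of the populations at each site."""
    return [f[0][x] + f[1][x] + f[2][x] for x in range(len(f[0]))]


def lbgk_step(f, tau):
    """One BGK collision followed by periodic streaming."""
    n = len(f[0])
    rho = density(f)
    # Collision: relax each population toward its equilibrium W[i] * rho.
    post = [[f[i][x] - (f[i][x] - W[i] * rho[x]) / tau for x in range(n)]
            for i in range(3)]
    # Streaming: rest particles stay, movers shift by one site.
    rest = post[0]
    right = [post[1][(x - 1) % n] for x in range(n)]
    left = [post[2][(x + 1) % n] for x in range(n)]
    return [rest, right, left]


n_sites, center, n_steps = 101, 50, 20
rho0 = [1.0 if x == center else 0.0 for x in range(n_sites)]
f = [[W[i] * r for r in rho0] for i in range(3)]
for _ in range(n_steps):
    f = lbgk_step(f, tau=1.0)
rho = density(f)
variance = sum(r * (x - center) ** 2 for x, r in enumerate(rho))
```

With tau = 1 the variance of the initial spike grows as 2Dt = t/3, in agreement with the leading-order diffusion coefficient. Deviations from this behavior at other relaxation times and scales are what a higher-order expansion would aim to capture.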

The results reported in this work can be used to analyze theoretically systems in which the hydrodynamic
description may break down. A typical example is the simulation of micro-electro-mechanical systems
(MEMS).

This research was conducted in collaboration with Y.H. Qian (Columbia University).

COMPUTER SCIENCE


PO-SHU  CHEN 

Parallel  Solution  of  Coupled  Aeroelastic  Problems 

The  accurate  prediction  of  aeroelastic  response  is  essential  in  the  design  of  high  performance  aircraft. 
It  requires  solving  the  coupled  fluid  and  structure  equations  simultaneously.  The  objectives  of  this  research 
are  to  investigate  a  variety  of  different  approaches  for  solving  aeroelastic  problems,  to  establish  a  proper 
module  between  structure  and  fluid  simulations,  to  solve  the  aeroelastic  response,  and  to  research  a  better 
integration algorithm for communication between fluid and structure equations.

A new package, Load and Motion Transfer (LMT), has been developed to be a ‘bridge’ between CFD
and FEM software for aeroelastic simulation. It can interpolate the initial nodal coordinates of
the fluid mesh from the structural nodal displacements and integrate the structural nodal forces from the
fluid pressure. It is superior to the FASIT code, currently being used by the MDO branch of NASA Langley,
in terms of flexibility, accuracy, and user-friendliness.
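
The two transfer directions can be sketched generically. The example below is not the LMT algorithm, whose details are not given here: it assumes a 1D structure, linear interpolation between structural nodes, and fluid loads already integrated from pressure into point forces. All function names are hypothetical. Using the transpose of the interpolation operator for the loads conserves both the total force and the virtual work.

```python
import bisect


def interpolation_rows(xs, xf):
    """For each fluid point, the two bracketing structure nodes and weights."""
    rows = []
    for x in xf:
        j = bisect.bisect_right(xs, x) - 1
        j = max(0, min(j, len(xs) - 2))      # clamp to a valid interval
        t = (x - xs[j]) / (xs[j + 1] - xs[j])
        rows.append(((j, 1.0 - t), (j + 1, t)))
    return rows


def transfer_motion(rows, us):
    """Fluid-point displacements interpolated from structural displacements."""
    return [sum(w * us[j] for j, w in row) for row in rows]


def transfer_loads(rows, ff, n_struct):
    """Structural nodal forces from fluid point forces (transpose operator)."""
    fs = [0.0] * n_struct
    for row, force in zip(rows, ff):
        for j, w in row:
            fs[j] += w * force
    return fs


xs = [0.0, 1.0, 2.0]              # structure node coordinates
xf = [0.25, 0.5, 1.5, 2.0]        # fluid surface point coordinates
rows = interpolation_rows(xs, xf)
us = [0.0, 1.0, 4.0]              # structural displacements
uf = transfer_motion(rows, us)
ff = [1.0, 2.0, 3.0, 4.0]         # fluid point forces (pressure times area)
fs = transfer_loads(rows, ff, len(xs))
```

Here `uf` is `[0.25, 0.5, 2.5, 4.0]` and `fs` is `[1.75, 2.75, 5.5]`, so the total force (10) and the work done by the loads (24.75) agree on both meshes.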

Now that a reliable transfer program is available, the next stage is to solve a simple static aeroelastic problem.
The fluid research code developed by Dimitri Mavriplis and the structure research code developed by Charbel
Farhat have been selected for this purpose. However, the present package can solve only the steady state
of aeroelastic problems. It cannot solve truly time-dependent problems, such as vibration. The second
stage of improvements is to determine a proper approach for an aeroelastic package with heavy communication.
The details of the approach are still under discussion.

This  research  was  conducted  in  collaboration  with  Tom  Zang  and  Anthony  Giunta  (NASA  Langley), 
Dimitri  Mavriplis  (ICASE),  and  Charbel  Farhat  (University  of  Colorado). 

THOMAS  W.  CROCKETT 

Porting  PGL  to  Beowulf-class  PC  Clusters 

The development of low-cost computational clusters based on commodity processors and networking
components  has  become  an  important  new  trend  in  parallel  computing.  Many  organizations  have  installed 
such  systems,  and  many  more  are  planning  to  do  so.  In  the  near  future,  Beowulf-class  clusters  could  become 
the  platform  of  choice  for  many  challenging  scientific  and  engineering  computations.  To  derive  maximum 
benefit  from  these  systems,  users  will  need  the  same  tools  and  capabilities  that  have  been  developed  for  use 
on  proprietary  parallel  computing  systems. 

One of these tools is the PGL rendering system, developed at ICASE to provide runtime visualization
support for parallel applications. PGL currently runs on half a dozen different MPP systems. We
have recently been working to develop a version for Linux-based PC clusters, using ICASE’s Coral system,
a Beowulf-class cluster, as a development platform. Although we expected this to be straightforward, a
number of problems have arisen involving compilers and low-level communication layers. Consequently, a
substantial portion of our effort has involved installing and testing compilers, message passing libraries,
network interfaces, and job schedulers. Serial and parallel versions of PGL are now running on the Coral cluster,
using 400 MHz Pentium II processors and Fast Ethernet communication hardware. Serial performance on
a benchmark suite is good, ranging from 70-107% of a 300 MHz Sun UltraSPARC II and 39-71% of a 250
MHz MIPS R10000. Parallel performance results are awaiting resolution of problems with Coral’s network
interfaces.

When  testing  and  performance  evaluation  of  the  Linux/PC  version  of  PGL  is  completed  on  Coral,  we 
plan  to  release  it  to  NASA’s  HPCCP/CAS  community  as  part  of  PGL  1.2.  Longer  term  plans  include 
additional  testing  and  algorithmic  modifications  for  distributed  shared  memory  architectures  such  as  the 
SGI Origin2000 and HP Exemplar, where scalability has so far been poor. We also have plans to incorporate
additional  functionality  in  PGL,  and  to  develop  improved  user  interfaces  for  interactive  applications. 

Application  of  Parallel  and  Distributed  Computing  to  Visualization  and  Data  Assimilation  Problems  in  the 
Atmospheric  Sciences 

To  implement  the  Vice  President’s  vision  of  a  Digital  Earth,  vast  quantities  of  data  from  disparate 
sources  must  be  integrated  into  an  intuitive,  accessible  representation.  NASA’s  Earth  Science  Enterprise 
sees Digital Earth as a promising framework for making much of its remote sensing data available to the
scientific  community  and  the  general  public.  To  implement  the  Digital  Earth  concept,  many  technologies 
will  need  to  be  brought  to  bear,  among  them  visualization,  networking,  and  high-performance  computing. 

We  are  exploring  the  potential  for  parallel  and  distributed  computing  and  visualization  techniques  to 
contribute  to  the  data  processing  and  data  assimilation  requirements  of  Digital  Earth.  We  have  used  ICASE’s 
PGL  rendering  system  to  develop  a  prototype  visualization  application  which  combines  a  medium-resolution 
(9  km)  elevation  model  of  the  Earth  with  a  true-color  surface  map,  including  support  for  several  different 
map projections. Preliminary performance tests have been conducted on Langley’s 16-processor Origin2000
system,  and  on  a  network  of  Sun  UltraSPARC  workstations  at  ICASE.  Although  rendering  performance  is 
good,  the  results  suggest  that  multi-resolution  data  representations  and  additional  graphics  functionality 
(such  as  triangle  strips  and  more  aggressive  clipping  algorithms),  in  addition  to  higher  processor  counts,  are 
needed  to  deliver  interactive  performance  with  models  of  this  size  (18.7  million  triangles).  User  interfaces 
which  are  tailored  to  the  application  will  also  be  required,  and  we  have  begun  evaluating  Java  for  this  purpose. 
In  related  activities,  we  served  on  Langley’s  Digital  Earth  Planning  Team,  and  continued  participating  in 
meetings  of  the  federal  Inter-agency  Digital  Earth  Working  Group. 

We  plan  to  combine  atmospheric  data  from  Langley’s  LITE  experiment  with  the  digital  terrain  model 
described above to produce an interactive tool for visualizing vertical structure in the atmosphere. The
ultimate  goal  is  to  develop  a  responsive,  user-friendly  system  which  will  combine  atmospheric  data  from 
a  variety  of  sources  to  obtain  a  better  understanding  of  the  physical  processes  involved.  We  also  want  to 
investigate  approaches  for  incorporating  much  larger  terrain  models,  such  as  USGS’s  30-arc-second  global 
elevation dataset (933 million grid points). We hope that the techniques developed will lead us toward
Digital Earth’s goal of providing interactive access to multi-petabyte datasets, a challenge which is beyond
the capability of current computing technology.
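
The quoted size of the 30-arc-second dataset follows from simple arithmetic, assuming the grid covers the full 360 by 180 degree globe (an assumption of this check, which nevertheless reproduces the figure of 933 million points):

```python
ARC_SECONDS_PER_DEGREE = 3600
SPACING_ARC_SECONDS = 30

samples_per_degree = ARC_SECONDS_PER_DEGREE // SPACING_ARC_SECONDS  # 120
columns = 360 * samples_per_degree   # samples around a circle of latitude
rows = 180 * samples_per_degree      # samples from pole to pole
grid_points = columns * rows
print(columns, rows, grid_points)
```

This is roughly two orders of magnitude more grid points than the 9 km model rendered in the prototype.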

DAVID  E.  KEYES 

Parallel  Implicit  Solvers  for  Simulation  of  Multiscale  Phenomena 

The  development  and  application  of  parallel  implicit  solvers  for  multiscale  phenomena  governed  by  PDEs 
are  our  chief  objectives.  Newton-Krylov-Schwarz  (NKS)  methods  have  proven  to  be  broadly  applicable, 
architecturally  versatile,  and  tunable  for  high  performance  on  today’s  high-end  commercial  parallel  platforms 
(e.g., Cray T3E, SGI Origin, IBM SP). Both structured-grid and unstructured-grid CFD legacy codes have
been ported to such platforms and reasonable objectives for algorithmic convergence rate, parallel efficiency,
and raw floating point performance have been met. However, architectural challenges have increased on
the  next  generation  of  high-end  machines,  as  represented,  for  instance,  by  the  ASCI  “blue”  machines  at 
Lawrence  Livermore  and  Los  Alamos  National  Laboratories,  and  also  on  Beowulf  clusters,  such  as  ICASE’s 
Coral.  Our  primary  efforts  are  concentrated  on  algorithmic  adaptations  of  NKS  methodology  appropriate 
for  the  emerging  architectures  and  on  evaluation  of  new  software  tools  and  methodology  to  get  the  most 
performance  out  of  them. 

The  general  approach  embodied  in  the  NKS  family  of  algorithms  is  documented  in  previous  ICASE 
technical reports, among other places. Specific emphases in the most recent reporting period include
enhanced per-node floating point performance, multilevel preconditioning, optimization, and evaluation of NKS
applications  on  the  ICASE  Beowulf  system. 

Per-node  floating  point  performance  has  been  a  source  of  major  consternation  for  users  (and  apologists) 
of  high-end  machines.  Anecdotal  evidence,  such  as  a  list  of  recent  “Bell  Prize”  peak  performance  winners, 
indicates  that  sparse,  grid-based  computations  do  not  stack  up  very  competitively  against  other  scientific 
simulations.  We  have  shown  that  attention  to  cache  line  reuse  in  the  organization  and  ordering  of  grid-based 
data  that  is  iteratively  dragged  up  and  down  the  memory  system  in  a  typical  PDE  code  can  make  an  order  of 
magnitude difference in execution time, apart from parallelism, and an experimental program to study this
effect  via  hardware  event  counters  is  on-going.  Our  ultimate  aims  are  to  apply  formal  optimization  techniques 
to  the  layout  of  program  data  for  optimal  register  and  cache  residency,  to  prepare  for  “Processors-in-Memory” 
(PIM)  programming  that  vendors  have  announced  in  future  products,  and  to  evaluate  the  algorithmic  utility 
of  multivector  forms  of  sparse  algorithms  with  better  cached  matrix  reuse. 
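
One common way to improve locality in grid-based data is to renumber the unknowns so that mesh neighbors are stored close together. The sketch below uses Cuthill-McKee ordering on a small structured grid whose numbering has been deliberately scrambled. It is offered only as an illustration: the orderings actually studied in this work are not specified above, and matrix bandwidth is used here as a rough proxy for cache line reuse.

```python
from collections import deque


def grid_edges(nx, ny):
    """Edges of an nx-by-ny structured grid with row-major node numbers."""
    edges = []
    for j in range(ny):
        for i in range(nx):
            node = j * nx + i
            if i + 1 < nx:
                edges.append((node, node + 1))
            if j + 1 < ny:
                edges.append((node, node + nx))
    return edges


def bandwidth(edges, label):
    """Largest index distance between two neighboring nodes."""
    return max(abs(label[a] - label[b]) for a, b in edges)


def cuthill_mckee(n, edges):
    """Breadth-first renumbering that visits low-degree nodes first."""
    adj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    order, seen = [], [False] * n
    for start in sorted(range(n), key=lambda v: len(adj[v])):
        if seen[start]:
            continue
        seen[start] = True
        queue = deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            for nb in sorted(adj[v], key=lambda u: len(adj[u])):
                if not seen[nb]:
                    seen[nb] = True
                    queue.append(nb)
    new_label = [0] * n
    for new, old in enumerate(order):
        new_label[old] = new
    return new_label


n_nodes = 64
# A deliberately poor numbering: multiply by 37 modulo 64 (a permutation).
scrambled = [(a * 37 % n_nodes, b * 37 % n_nodes) for a, b in grid_edges(8, 8)]
old_bw = bandwidth(scrambled, list(range(n_nodes)))
new_label = cuthill_mckee(n_nodes, scrambled)
new_bw = bandwidth(scrambled, new_label)
```

After renumbering, neighboring unknowns lie much closer together in memory, so a sweep over the mesh touches fewer distinct cache lines per node.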

Single-level  Schwarz  preconditioning  is  sufficient  for  many  purposes,  especially  unsteady  or  pseudo-time 
continuation applications. However, we have recently demonstrated on some highly nonlinear radiation
transport applications that 2-level Schwarz methods, with a coarse level that is removed from the fine grid
by many powers of two in density, are not only superior in convergence but can be somewhat superior in
overall  execution  time,  in  spite  of  the  global  coordination  required. 

Optimization  is  usually  the  real  goal  of  computational  simulation  capability.  As  a  goal  unto  itself,  parallel 
optimization  is  much  studied,  but  optimization  subject  to  a  high-dimensional  set  of  equality  constraints 
coming  from  a  discretized  PDE  is  a  situation  in  which  the  tail  wags  the  dog.  Following  the  leads  of  O.  Ghattas 
and  D.P.  Young  in  this  area,  we  are  exploring  the  utility  of  the  NKS  “rootfinder”  as  a  Lagrange-NKS 
optimizer. 
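
The sense in which a rootfinder becomes an optimizer can be written out for a generic equality-constrained problem. The symbols below (state x, design variables u, objective f, discretized PDE residual c, multipliers λ) are notation assumed for this sketch and are not taken from the report.

```latex
% PDE-constrained optimization problem:
\min_{x,\,u}\; f(x,u) \quad \text{subject to} \quad c(x,u) = 0 .

% Lagrangian and first-order optimality conditions:
\mathcal{L}(x,u,\lambda) = f(x,u) + \lambda^{T} c(x,u),
\qquad
\begin{aligned}
\nabla_x \mathcal{L} &= f_x + c_x^{T}\lambda = 0,\\
\nabla_u \mathcal{L} &= f_u + c_u^{T}\lambda = 0,\\
\nabla_\lambda \mathcal{L} &= c(x,u) = 0 .
\end{aligned}

% Newton's method applied to this nonlinear system gives, at each step,
\begin{pmatrix}
\mathcal{L}_{xx} & \mathcal{L}_{xu} & c_x^{T}\\
\mathcal{L}_{ux} & \mathcal{L}_{uu} & c_u^{T}\\
c_x & c_u & 0
\end{pmatrix}
\begin{pmatrix} \delta x\\ \delta u\\ \delta\lambda \end{pmatrix}
= -
\begin{pmatrix} \nabla_x \mathcal{L}\\ \nabla_u \mathcal{L}\\ c \end{pmatrix} .
```

The optimality conditions form a larger root-finding problem of the same kind as the PDE itself, so the Newton step can in principle be solved by a Krylov method with a Schwarz-type preconditioner, as in NKS.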

In  terms  of  peak  performance,  the  ICASE  Beowulf  cluster  is  cost-effective  hardware,  but  the  software 
environment  is  co-critical.  In  tests  of  the  same  Euler  benchmark  used  on  the  ASCI  machines,  we  have 
shown  (see  the  Coral  webpages)  that  the  Portland  Group  compilers  are  particularly  effective  on  native 
non-cache-optimized code, with uniprocessor running times that beat the ASCI processors and also the
same 400 MHz Pentium II with NT compilers. For cache-optimized code, the R10000 and Power2 are still
somewhat  superior,  but  the  per-node  performance  of  Coral  is  almost  competitive,  independent  of  economic 
considerations. 

We  will  continue  to  develop  NKS  methods  in  implicit  parallel  CFD,  examining  a  variety  of  algorithmic, 
programming  paradigm,  and  architectural  issues.  We  will  also  increase  the  complexity  of  the  models  in  our 
NKS  radiation  transport  work,  in  accordance  with  the  ASCI  project  roadmap. 

This research was conducted in collaboration with W. Kyle Anderson (NASA Langley), Dana Knoll (Los
Alamos National Laboratory), Dinesh Kaushik, Nilan Karunaratne, and Xin He (Old Dominion University),
and Satish Balay, William D. Gropp, Lois C. McInnes, and Barry F. Smith (Argonne National Laboratory).

GERALD  LUTTGEN 

Statecharts  via  Process  Algebra 

Statecharts  is  a  visual  language  for  specifying  synchronous  reactive  systems  which  is  popular  among 
software  engineers,  despite  the  complexity  of  its  step  semantics.  It  extends  finite-state  machines  by  concepts 
of  concurrency,  hierarchy,  and  priority.  Most  Statecharts  variants  do  not  have  a  compositional  semantics 
and,  thereby,  prohibit  the  reuse  of  specifications  of  systems’  components.  The  reason  for  this  prohibition 
is  the  subtle  interplay  between  micro  and  macro  steps,  as  imposed  by  Statecharts’  synchrony  hypothesis 
and  the  principle  of  causality.  The  focus  of  this  research  is  to  develop  a  compositional  process-algebraic 
framework  which  is  expressive  enough  to  embed  several  Statecharts  variants. 

The  process  algebra  that  has  been  developed  is  inspired  by  timed  process  languages  and  unifies  the 
principles  of  Statecharts  semantics,  such  as  concurrency,  causality,  and  synchrony.  It  represents  macro  steps 
as sequences of micro steps which are enclosed by clock ticks. The benefits of this approach include the
establishment of a compositional framework (1) which is suitable for embedding several Statecharts variants,
(2)  which  is  intuitive  and  simple  since  causal  orderings  are  not  encoded  in  transition  labels,  (3)  which  can  be 
equipped  with  behavioral  equivalences  carried  over  from  traditional  process  algebras,  and  (4)  which  allows 
for  interfacing  Statecharts  to  verification  tools. 

In the future, we hope to apply the insights into the relationship between clock semantics and Statecharts
semantics obtained during this research to develop a Statecharts variant which is suitable for specifying
distributed reactive systems.

This research was conducted in collaboration with Rance Cleaveland (SUNY at Stony Brook) and Michael
von  der  Beeck  (TU  Munich). 

Applying  Model  Checking  Tools  to  the  Verification  of  Flight  Guidance  Systems 

Mode confusion is one of the most serious problems in aviation safety. Today’s digital flight decks are too
complex for pilots to be aware of the actual states - or modes - of all systems. A year ago, NASA
Langley  started  an  initiative  to  analyze  the  mode  logic  of  a  flight-guidance  system  to  uncover  weaknesses  in 
its  design  which  may  lead  to  mode  confusion.  For  this  purpose,  the  mode  logic  was  modeled  as  a  finite  state 
machine,  and  the  theorem  prover  PVS  was  used  to  reason  about  the  system.  The  objective  of  this  research  is 
to  investigate  whether  model  checking  techniques  -  i.e.,  sophisticated,  automated  state-exploration  methods 
-  are  able  to  achieve  this  task  “better”  than  theorem  proving. 
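
The idea behind automated state exploration can be illustrated with a toy explicit-state invariant checker. The mode logic below (modes `ROLL`, `HDG`, `NAV`, an armed `NAV` mode, and the three events) is entirely hypothetical and far simpler than the flight-guidance system under study. The code also does not reproduce the algorithms of any particular model checker.

```python
from collections import deque

INITIAL = ("ROLL", None)   # (active lateral mode, armed mode)


def successors(state, buggy=False):
    """Enumerate (event, next_state) pairs of the hypothetical mode logic."""
    active, armed = state
    # The HDG button toggles between heading hold and basic roll mode.
    yield "press_hdg", ("ROLL" if active == "HDG" else "HDG", armed)
    # The NAV button deselects NAV, or toggles whether NAV is armed.
    if active == "NAV" and not buggy:
        yield "press_nav", ("ROLL", None)
    elif armed == "NAV":
        yield "press_nav", (active, None)
    else:
        yield "press_nav", (active, "NAV")
    # Capture of the navigation source promotes an armed NAV to active.
    if armed == "NAV":
        yield "capture", ("NAV", None)


def no_double_nav(state):
    """Invariant: NAV is never active and armed at the same time."""
    return state != ("NAV", "NAV")


def check_invariant(initial, invariant, buggy=False):
    """Breadth-first search of all reachable states.

    Returns None if the invariant holds everywhere, otherwise the
    shortest event sequence leading to a violating state.
    """
    parent = {initial: None}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            trace = []
            while parent[state] is not None:
                event, state = parent[state]
                trace.append(event)
            return trace[::-1]
        for event, nxt in successors(state, buggy):
            if nxt not in parent:
                parent[nxt] = (event, state)
                queue.append(nxt)
    return None


ok_result = check_invariant(INITIAL, no_double_nav)
bug_trace = check_invariant(INITIAL, no_double_nav, buggy=True)
```

For the correct logic the search visits every reachable state and reports no violation. For the variant with the missing guard it returns the counterexample `press_nav`, `capture`, `press_nav`, the kind of diagnostic trace a model checker produces automatically.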

In this light, the mode logic is modeled and analyzed using three popular model-checking tools: Murφ,
SMV, and Spin. In general, all three tools are able to handle the task fairly well and promise to scale up.
The modeling is most elegant in Murφ and SMV since their specification languages match the characteristics
of the mode logic as a modular, synchronous system. Murφ’s rich language even allows for carrying over the
PVS specification of the mode logic one-to-one, and its ability to specify and verify invariants enables the
efficient verification of many properties related to mode confusion. For the latter, however, the temporal
logic CTL - as employed in SMV - is more practical due to its flexibility to reason about system paths rather
than system states. Moreover, SMV’s model checker, which is based on Binary Decision Diagrams, is faster
than the other tools and outperforms PVS by returning verification results instantly. Finally, diagnostic
information generated by each of the three tools is as adequate as the information obtained when using PVS.

In the future, we will model larger parts of the digital flight deck, such that the model-checking techniques
may investigate more complex problems related to mode confusion.

This  research  was  conducted  in  collaboration  with  Victor  Carreno  (NASA  Langley). 

KWAN-LIU  MA 

Image  Graphs  -  A  Novel  Approach  to  Visual  Data  Exploration 

Effort  spent  generating  and  collecting  data  is  wasted  unless  there  are  effective  means  to  organize  and 
understand  this  data.  This  fact  poses  a  problem  in  some  modern  visualization  research.  For  example,  in 
volume  rendering  the  current  data  handling  and  visualization  technology  cannot  handle  the  sheer  size  of 
emerging  datasets.  While  various  efforts  have  been  made  to  condense  datasets  and  accelerate  rendering 
calculations, little work has been done to coherently represent the process and results of this type of
visualization. However, this information about the data exploration is knowledge that should be shared and
reused.  The  objective  of  this  research  is  to  develop  a  mechanism  which  not  only  offers  a  representation  of 
this  knowledge  but  also  serves  as  an  interface  for  visual  data  exploration. 

We use a graph-based approach to represent not only the results but also the process of data
visualization. Each node in the graph consists of an image and the corresponding visualization parameters used
to  produce  it.  Each  edge  in  the  graph  shows  the  change  in  rendering  parameters  between  the  two  nodes  it 
connects.  We,  thus,  call  this  design  image  graphs.  Image  graphs  are  not  just  static  representations  since 
users  can  interact  with  a  graph  to  review  a  previous  visualization  session  or  to  perform  new  rendering.  In 
particular,  operations  which  cause  changes  in  rendering  parameters  can  propagate  through  the  graph.  Image 
graphs  help  streamline  the  process  of  visual  data  exploration  in  two  ways.  First,  the  graphs  give  the  user 
a  representation  of  the  relationship  between  the  visualization  parameter  changes  and  the  images  produced 
using  them.  Often  these  relationships  are  not  obvious  just  through  inspection  of  the  rendered  images.  An 
understanding  of  how  specific  rendering  parameter  changes  will  affect  the  image  output  is  important  because 
it  reduces  the  number  of  images  the  user  must  produce  to  find  parameters  which  yield  a  useful  image,  and 
these  images  can  be  quite  time  consuming  to  produce.  Second,  the  dynamic  features  of  the  graphs,  such  as 
annotation  and  automatic  pruning,  facilitate  collaboration  and  animation.  They  also  help  speed  the  search 
for good rendering parameters by allowing users to perform operations on groups of nodes. These
operations include simple modification of rendering parameters, combination of nodes to form “child” nodes with
their  properties,  and  propagation  of  modifications  through  the  graph.  We  have  implemented  a  web-based 
volume  visualization  system  which  uses  the  image  graph  design  for  the  purpose  of  supporting  remote  and 
collaborative  visualization. 
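
The data structure described above can be sketched in a few lines. The class and method names are hypothetical, the renderer is a stand-in callable, and the propagation rule (a change flows to descendants except where an edge explicitly overrides that parameter) is an assumption of this sketch rather than a description of the implemented system.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    params: dict                  # visualization parameters for this image
    image: object = None          # the rendered image (any object here)
    children: list = field(default_factory=list)   # (delta, Node) pairs


class ImageGraph:
    def __init__(self, render):
        self.render = render      # callable: params dict -> image

    def add_root(self, params):
        node = Node(dict(params))
        node.image = self.render(node.params)
        return node

    def derive(self, parent, delta):
        """Add a child whose edge records the parameter change delta."""
        params = {**parent.params, **delta}
        child = Node(params, self.render(params))
        parent.children.append((delta, child))
        return child

    def propagate(self, node, change):
        """Apply a change to a node and push it through the graph."""
        if not change:
            return
        node.params.update(change)
        node.image = self.render(node.params)
        for delta, child in node.children:
            # Parameters set explicitly on an edge keep their own values.
            inherited = {k: v for k, v in change.items() if k not in delta}
            self.propagate(child, inherited)


# Stand-in renderer: the "image" is just a canonical tuple of parameters.
graph = ImageGraph(render=lambda p: tuple(sorted(p.items())))
root = graph.add_root({"opacity": 0.5, "colormap": "gray"})
child = graph.derive(root, {"opacity": 0.8})
graph.propagate(root, {"colormap": "hot", "opacity": 0.1})
```

After the propagation, the child picks up the new colormap but keeps the opacity recorded on its incoming edge, so the graph continues to document how each image differs from its parent.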

We  are  presently  designing  a  comprehensive  user  study  to  understand  the  extent  to  which  the  image 
graphs  can  be  shared  and  reused,  and  to  refine  the  design  of  the  visualization  system  we  have  built  and  its 
image  graph  interface.  Furthermore,  we  think  image  graphs  would  be  useful  for  any  type  of  data  exploration 
problem  which  produces  images  of  data  as  a  function  of  some  set  of  parameters.  Therefore,  in  addition 
to  volume  visualization,  other  possible  applications  include  radiosity  calculations,  2D  image  filtering,  and 
polygon-based rendering. Our future work includes demonstrating that our approach is indeed useful for
these other problem domains.

PIYUSH MEHROTRA


Arcade:  A  Distributed  Computing  Environment  for  ICASE 

Distributed heterogeneous computing is being increasingly applied to a variety of large-size computational
problems. Such computations, for example, the multidisciplinary design optimization of an aircraft,
generally consist of multiple heterogeneous modules interacting with each other to solve the problem at
hand. Such applications are generally developed by a team in which each discipline is the responsibility of
experts  in  the  field.  The  objective  of  this  project  is  to  develop  a  GUI-based  environment  which  supports  the 
multi-user  design  of  such  applications  and  their  execution  and  monitoring  in  a  heterogeneous  environment 
consisting  of  a  network  of  workstations,  specialized  machines,  and  parallel  architectures. 

We have been implementing a Java-based three-tier prototype system which supports a thin client
interface for the design and execution of multi-module codes. The middle tier consists of logic to process the
user  input  and  also  to  manage  the  resource  controllers  which  comprise  the  third  tier.  In  the  last  few  months 
we  have  focused  on  the  issue  of  resource  discovery  and  monitoring.  In  particular,  we  have  implemented 
an  add-on  module  to  manage  the  resources  based  on  the  JINI  technology  developed  by  Sun  for  resource 
management.  JINI  allows  independent  resources  to  announce  their  presence  and  current  status  to  a  central 
server. This module provides a client interface which allows the user to monitor the current status of the
resources.  One  of  the  issues  with  JINI  is  that  it  uses  the  multicast  protocol  for  its  discovery  and  join 
processes.  Such  protocols  do  not  work  over  subnets  or  across  domains.  We  have  designed  a  hierarchical 
implementation  of  servers  which  allows  the  resources  to  announce  their  presence  across  the  whole  Arcade 
environment  even  if  it  spans  multiple  domains.  Similarly,  it  allows  the  resource  allocation  module  to  query 
the  status  of  resources  across  the  whole  environment. 

We continue to develop the system, adding other features such as support for conducting parameter
studies.  We  also  intend  to  expand  the  kind  of  modules  that  can  be  used  by  Arcade,  in  particular  providing 
support  for  CORBA-based  components. 

This  research  was  conducted  in  collaboration  with  K.  Maly,  A.  Al-Theneyan,  and  M.  Zubair  (Old 
Dominion  University). 

Languages  for  High  Performance  and  Distributed  Computing 

There  are  many  approaches  to  exploiting  the  power  of  parallel  and  distributed  computers.  Under  this 
project, our focus is to evaluate these different approaches, proposing extensions and new compilation techniques
where appropriate.

Recently  a  proposal  was  put  forth  for  a  set  of  language  extensions  to  Fortran  and  C  based  upon  a 
fork-join model of parallel execution; called OpenMP, it aims to provide a portable shared memory programming
interface for shared memory and low latency systems. However, these extensions ignore the issue of
data locality, which becomes a performance issue on shared address space machines that use a physically
distributed  memory  system.  We  have  proposed  a  set  of  OpenMP  extensions  to  allow  users  to  express  the 
distribution  of  the  data  structures  in  a  manner  similar  to  the  one  used  in  HPF.  We  are  currently  in  the 
process  of  implementing  these  extensions  in  order  to  study  their  efficacy. 

We have also continued our study on the applicability of HPF to a series of codes using semi-structured
grids, including multiblock, semi-coarsening multigrid, and structured AMR algorithms. We have examined
a range of data distribution strategies for these algorithms and have tried to characterize the situations
under  which  each  of  these  strategies  would  produce  the  best  results. 
OPUS, a language jointly developed by ICASE and University of Vienna, provides high-level support
for  programming  multimodule  applications.  In  the  last  few  months  we  have  redesigned  and  reimplemented 
the  Opus  runtime  system.  We  have  also  completed  the  compiler  front-end  necessary  for  translating  Opus 
programs  to  target  the  runtime  system.  This  translator  has  been  implemented  using  the  Vienna  Fortran 
Compiler  System  of  the  University  of  Vienna.  The  system  allows  users  to  translate  and  execute  Opus 
programs  across  a  network  of  workstations.  We  are  in  the  process  of  evaluating  our  design  and  enhancing  it 
to  incorporate  support  for  distributed  processing  within  Opus  modules. 

This  research  was  conducted  in  collaboration  with  B.  Chapman  (University  of  Houston),  Erwin  Laure 
(University of Vienna), and H. Zima (University of Vienna).

ALEX  POTHEN 

Parallel  Algorithms  for  Incomplete  Factorization  Preconditioners 

The  parallel  computation  of  incomplete  factorization  (ILU)  preconditioners  for  solving  large  systems  of 
equations  has,  until  recently,  remained  an  elusive  goal.  We  propose  to  develop  new  algorithmic  approaches 
that  avoid  the  serial  bottlenecks  that  have  plagued  existing  algorithms,  to  implement  these  algorithms,  and 
to  identify  applications  where  these  preconditioners  are  effective. 

The  new  algorithm  is  based  on  a  characterization  of  the  fill  (zero  elements  in  the  coefficient  matrix 
becoming  nonzero  during  the  factorization)  in  terms  of  paths  in  the  adjacency  graph  associated  with  the 
coefficient  matrix.  We  assume  that  the  adjacency  graph  can  be  partitioned  into  subgraphs  of  roughly  equal 
sizes  such  that  few  edges  are  cut  by  the  partition.  We  map  the  subgraphs  to  processors,  form  a  subdomain 
interconnection  graph,  and  order  the  subdomains  so  as  to  reduce  global  dependences.  On  each  subdomain, 
we  locally  reorder  the  interior  vertices  before  the  boundary  vertices.  This  reordering  limits  the  fill  that  joins 
a  subgraph  on  one  processor  to  a  subgraph  on  another,  and  enhances  the  concurrency  in  the  computation. 
The preconditioner computation takes place in two phases: in the first phase, each processor computes the
rows of the preconditioner corresponding to the interior vertices of its subdomain. In the second phase,
the  rows  corresponding  to  the  boundary  nodes  are  computed. 
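
The local reordering step can be illustrated with a small sketch. This is not the code described above; the function name, the dictionary-based graph representation, and the partition map are assumptions made purely for illustration. A vertex is treated as interior when all of its neighbours lie in the same subdomain, and as a boundary vertex otherwise; each subdomain then lists its interior vertices first.

```python
def interior_then_boundary(adjacency, part):
    """Return, per subdomain, its vertices ordered interior-first.

    adjacency: dict mapping vertex -> iterable of neighbouring vertices
    part:      dict mapping vertex -> subdomain (processor) id
    Both names are illustrative assumptions, not taken from the report.
    """
    order = {}
    for p in sorted(set(part.values())):
        interior, boundary = [], []
        for v in sorted(adjacency):
            if part[v] != p:
                continue
            # A vertex is a boundary vertex if any neighbour lives elsewhere.
            if any(part[u] != p for u in adjacency[v]):
                boundary.append(v)
            else:
                interior.append(v)
        order[p] = interior + boundary
    return order


# A path graph 0-1-2-3-4-5 split into two subdomains of three vertices.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
part = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
local_order = interior_then_boundary(adjacency, part)
```

In this toy example only vertices 2 and 3 touch the cut edge, so they are ordered last within their subdomains; rows for the interior vertices could be processed in the first phase without communication.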

Our  preliminary  results  on  the  SGI  Origin  show  efficiencies  greater  than  75%  on  up  to  16  processors. 
We  are  continuing  to  develop  our  parallel  implementation,  and  are  incorporating  new  algorithms  that  we 
have  designed  for  efficient  serial  computation  of  preconditioners. 

This  research  was  conducted  in  collaboration  with  David  Hysom  (Old  Dominion  University  and  ICASE). 

Spindle:  An  Algorithmic  Laboratory  for  Ordering  Algorithms 

We  have  begun  to  work  on  an  algorithmic  laboratory  for  quickly  prototyping  promising  algorithms  and 
experimenting  with  a  collection  of  algorithmic  variants  for  several  ordering  problems.  Among  these  are  the 
fill  reduction  problem:  Order  the  rows  and  columns  of  the  coefficient  matrix  to  reduce  the  fill  in  sparse 
Gaussian elimination (both complete and incomplete factorizations); and the sequencing problem: Given
a  set  of  elements,  and  pairs  of  elements  that  are  related,  order  the  elements  such  that  related  elements 
are  numbered  consecutively.  We  employ  object-oriented  design  techniques  (OOD)  to  make  the  laboratory 
flexible  and  easy  to  extend. 

OOD  manages  complexity  by  means  of  decomposition  and  abstraction.  We  decompose  our  software 
into  two  main  types  of  objects:  structural  objects  corresponding  to  data  structures,  and  algorithmic  objects 
corresponding  to  algorithms.  This  design  decouples  data  structures  from  algorithms,  permitting  a  user  to 
experiment with different algorithms and different data structures, and if necessary develop new algorithms
and  data  structures.  We  have  implemented  seven  variants  from  the  family  of  minimum  degree  ordering 
algorithms  using  this  design  paradigm.  Some  of  these  algorithms  were  developed  only  in  the  past  few  years, 
and  prior  to  our  work,  there  was  no  single  code  that  implemented  all  of  these  algorithms.  Our  implementation 
makes  it  possible  for  us  to  change  ordering  algorithms  midstream  while  ordering  a  problem.  We  have  found 
this  to  be  of  benefit,  since  a  hybrid  algorithm  that  employs  the  multiple  minimum  degree  (MMD)  algorithm 
and  switches  at  later  stages  to  the  approximate  minimum  degree  (AMD)  algorithm  can  improve  performance 
for  problems  where  either  algorithm  has  poor  performance.  These  ordering  algorithms  are  quite  sophisticated, 
and  their  performance  on  various  problem  classes  is  poorly  understood.  Our  algorithmic  laboratory  enhances 
our  understanding  of  these  issues  since  encapsulation  makes  it  possible  to  examine  the  state  of  the  objects 
in  our  code  during  execution. 

We have also implemented wavefront-reducing algorithms, such as the Cuthill-McKee and Sloan ordering
algorithms, in our library. Spindle, our code, is available as a stand-alone program and with an interface
to  Matlab. 
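
For readers unfamiliar with the Cuthill-McKee ordering, a minimal textbook-style sketch follows. It is not Spindle's implementation or interface; the function names and the dictionary-of-sets graph representation are assumptions, and the bandwidth helper is used here only as a simple proxy for the wavefront measures such orderings aim to reduce.

```python
from collections import deque


def cuthill_mckee(adjacency):
    """Return a Cuthill-McKee ordering of an undirected graph.

    adjacency: dict mapping vertex -> set of neighbours (assumed symmetric).
    """
    degree = {v: len(nbrs) for v, nbrs in adjacency.items()}
    visited = set()
    order = []
    # Start each connected component from an unvisited vertex of least degree.
    for start in sorted(adjacency, key=lambda v: (degree[v], v)):
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            # Breadth-first search, neighbours taken in increasing degree.
            for u in sorted(adjacency[v], key=lambda w: (degree[w], w)):
                if u not in visited:
                    visited.add(u)
                    queue.append(u)
    return order


def bandwidth(adjacency, order):
    """Largest distance between the positions of two adjacent vertices."""
    pos = {v: i for i, v in enumerate(order)}
    return max((abs(pos[u] - pos[v]) for v in adjacency for u in adjacency[v]),
               default=0)


# A path graph 3-0-4-1-2 whose vertices are labelled in a scrambled order.
path = {3: {0}, 0: {3, 4}, 4: {0, 1}, 1: {4, 2}, 2: {1}}
natural = sorted(path)
cm_order = cuthill_mckee(path)
```

On this example the natural labelling places adjacent vertices up to four positions apart, while the Cuthill-McKee ordering walks the path end to end so that neighbours are numbered consecutively.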

This  research  was  conducted  in  collaboration  with  Gary  Kumfert  (Old  Dominion  University  and  ICASE). 

KEVIN  P.  ROE 

Parallelization  of  a  Multigrid  Incompressible  Viscous  Cavity  Flow  Solver  Using  OpenMP 

Effective  use  of  parallel  machines  requires  easily  maintainable  and  portable  programming  models  that 
allow  users  to  exploit  parallelism  in  applications  written  in  a  standard  high-level  language.  MPI  provides 
portability; however, it can be more difficult to maintain and is not a high-level programming model. High
Performance  Fortran  (HPF)  is  portable  and  fairly  easy  to  maintain.  OpenMP  is  also  portable  on  shared 
memory  architectures  and  fairly  easy  to  maintain,  although  it  can  only  be  used  on  shared  memory  machines. 
OpenMP has some advantages over HPF and MPI when one is using a shared memory machine, such
as allowing the user to parallelize their code incrementally. Another benefit is that when the number of
processors is changed, data residing in memory does not have to be reshaped.

To evaluate OpenMP’s capabilities, we examine a two-dimensional multigrid incompressible viscous
flow  solver.  This  solver,  originally  written  to  be  run  sequentially,  only  required  one  major  change.  The 
Symmetric  Gauss  Seidel  (SGS)  algorithm  that  was  originally  used  had  to  be  replaced  because  its  red-black 
parallel  version  was  numerically  unstable.  Since  we  were  more  interested  in  testing  OpenMP’s  capabilities, 
a  simple  parallel  Jacobi  algorithm  was  substituted  in  its  place.  Results  of  the  code’s  parallelization  using 
OpenMP on the SGI Origin2000 at NASA Ames were promising. Parallel efficiencies were in the 90-100%
range  for  four  processors  on  a  problem  size  of  512x512.  Tests  using  a  different  number  of  processors  for  each 
grid  level  at  runtime  were  also  conducted.  We  were  able  to  reduce  the  overhead  associated  with  using  too 
many  processors  on  a  small  problem  size  by  specifying  the  number  of  threads  (and  hence  processors)  for 
each  grid  level  at  runtime. 
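
To illustrate why a Jacobi smoother parallelizes easily, the sketch below performs one Jacobi sweep on a model problem. It is written in Python rather than OpenMP-annotated Fortran or C, and the 2-D Laplace model problem, the function names, and the efficiency helper are assumptions for illustration; they are not taken from the solver described above. Each interior update reads only values from the previous iterate, so the loop iterations are independent of one another.

```python
def jacobi_sweep(u):
    """One Jacobi sweep for the 2-D Laplace equation on a square grid.

    u is a list of lists; boundary values are held fixed. Every interior
    update reads only the old grid, so the updates are independent and can be
    carried out in any order (or concurrently).
    """
    n = len(u)
    new = [row[:] for row in u]
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            new[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                + u[i][j - 1] + u[i][j + 1])
    return new


def parallel_efficiency(t_serial, t_parallel, processors):
    """Efficiency = speedup / processors = T_1 / (p * T_p)."""
    return t_serial / (processors * t_parallel)


# 4x4 grid: top boundary held at 1.0, everything else 0.0.
grid = [[1.0] * 4] + [[0.0] * 4 for _ in range(3)]
after_one = jacobi_sweep(grid)
```

A Gauss-Seidel sweep, by contrast, would update the grid in place and so would make each point depend on neighbours already updated in the same sweep, which is what forces a red-black or similar ordering in a parallel version.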

We  are  still  investigating  where  the  loss  in  efficiency  is  occurring;  we  believe  that  larger  problem  sizes 
will yield better parallel efficiencies when more processors are utilized. We will also examine a mechanism
for  determining  the  ideal  number  of  processors  to  utilize  for  each  grid  level  at  runtime. 

This research was conducted in collaboration with Piyush Mehrotra (ICASE).
LINDA STALS


Solution  Techniques  for  Radiation  Transport  Equations 

When  modeling  radiation  transport,  a  system  of  three  nonlinear  time-dependent  equations  is  often  used. 
Due  to  the  behavior  of  the  nonlinearities,  this  system  is  computationally  expensive  to  solve.  We  are  studying 
the  use  of  two  different  approaches  to  reduce  the  solution  time,  namely,  the  use  of  better  solution  techniques, 
such  as  multigrid  methods,  and  the  use  of  parallel  machines. 

As  a  preliminary  study  of  the  radiation  transport  equations,  we  have  considered  the  special  case  where 
all  energies  are  in  equilibrium.  In  such  a  case,  the  system  of  equations  can  be  reduced  to  a  single  equation. 
This  single  equation  is  interesting  in  its  own  right  as  it  contains  strong  nonlinearities  and  large  jumps  in 
the  coefficients.  The  results  and  lessons  learned  in  the  study  of  this  single  equation  will  be  used  when  we 
implement  the  system  of  three  equations. 

The  discretization  technique  we  used  was  the  finite  element  method  with  piecewise  linear  basis  elements. 
We  are  currently  comparing  our  results  with  those  obtained  by  other  groups,  which  use  different  discretization 
techniques,  to  ensure  that  the  finite  element  method  is  ‘capturing’  the  right  information. 

We  also  compared  the  use  of  Newton’s  method  with  the  FAS  (nonlinear  multigrid)  scheme.  We  found 
that  when  the  jumps  in  the  coefficients  were  not  too  large  both  methods  performed  well.  However,  when  the 
size  of  the  jumps  was  increased  we  needed  to  modify  our  algorithms.  In  particular,  for  the  FAS  scheme  to 
work  properly,  the  equation  on  the  coarsest  grid  had  to  be  solved  to  a  high  degree  of  accuracy.  For  Newton’s 
method,  automatically  calculating  the  step  size  greatly  reduced  the  number  of  iterations.  Furthermore,  the 
use  of  adaptive  refinement  helped  the  solution  process  as  the  approximation  calculated  on  the  coarser  grids 
gave  a  good  initial  guess  to  the  solution  on  the  current  grid.  As  the  system  of  three  equations  also  contains 
large  jumps  in  the  coefficients,  we  believe  that  the  techniques  and  methods  which  we  have  shown  to  work 
here  will  be  a  good  starting  point  when  we  try  to  solve  the  system. 
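
The text does not say how the Newton step size was calculated. As one common possibility, the sketch below uses a backtracking (step-halving) line search on a scalar equation; the function name, the tolerance values, and the arctangent test problem are assumptions chosen for illustration, not the scheme used in this study.

```python
import math


def damped_newton(f, df, x0, tol=1e-10, max_iter=50):
    """Newton's method with a backtracking (step-halving) line search.

    The step length t is chosen automatically: start from the full Newton
    step (t = 1) and halve it until |f| decreases.
    """
    x = x0
    for iteration in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x, iteration
        step = -fx / df(x)
        t = 1.0
        while abs(f(x + t * step)) >= abs(fx) and t > 1e-8:
            t *= 0.5
        x = x + t * step
    return x, max_iter


# Undamped Newton diverges for atan(x) = 0 from x0 = 3; the damped form does not.
root, iterations = damped_newton(math.atan, lambda x: 1.0 / (1.0 + x * x), 3.0)
```

The full Newton step from this starting point overshoots the root badly, so the line search shortens the first step; once the iterate is close to the root, full steps are accepted and convergence is rapid.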

We  ran  the  code  on  a  network  of  workstations  and  verified  that  we  get  the  same  mathematical  results 
as  though  it  were  run  in  parallel.  However,  we  do  not  have  any  parallel  efficiency  results  yet.  One  of  our 
next  goals  is  to  test  the  parallel  efficiency  of  our  approach. 

The  form  of  the  nonlinear  term  in  radiation  transport  equations  can  vary.  So  far  we  have  only  considered 
the  weakest  or  least  nonlinear  form.  We  would  also  like  to  rerun  our  experiments  using  the  other  forms  of 
the  equations. 

This research was conducted in collaboration with David Keyes and Alex Pothen (Old Dominion University
and ICASE) and Dimitri Mavriplis (ICASE) as part of an ASCI project.

HANS  ZIMA 

Feedback-directed and Adaptive Compilation

Traditionally,  compilation  has  been  seen  as  a  batch  process,  in  which  a  high-level  language  is  translated 
into  a  machine  or  assembly  language  executable  on  a  given  target  machine.  Compilation  is  performed 
in  a  given  machine/system  environment  known  to  the  compiler,  which  can  be  exploited  for  optimizing 
the  target  program.  If  the  environment  is  not  known  at  compilation  time,  or  if  it  may  change  during 
execution,  the  target  program  has  to  be  parameterized  accordingly.  The  late  binding  associated  with  such 
a  parameterization  guarantees  flexibility  on  the  one  hand,  but  on  the  other  hand  may  result  in  less  efficient 
code  if  compared  to  an  early  binding  approach.  The  objective  of  this  study  is  to  examine  the  changing  role 
of  the  compiler  in  modern  computing  environments  and  its  interrelationship  with  performance  tools. 
The traditional view of compilation can no longer be maintained, for reasons due to the evolution of
computing  systems,  languages,  and  compiling  techniques.  For  example,  in  a  heterogeneous  environment 
(which  may  encompass  the  whole  Internet),  a  client  may  send  a  source  program  (or  a  partially  translated 
intermediate  version  of  the  source)  to  a  remote  server  for  compilation  and  execution.  Similarly,  in  contrast  to 
traditional static compilation, the Java HotSpot virtual machine identifies bottlenecks during interpretation
of  a  Java  program,  and  optimizes  execution  by  performing  on-the-fly  compilation  to  native  code.  The 
inspector/executor approach, which is being routinely used for the runtime optimization of parallel loops
in high-level languages, is an example of runtime compilation using feedback based on information gained
during  execution.  Systems  such  as  ATLAS  and  FFTW  use  performance  feedback  to  optimize  the  code  for  a 
given  environment.  A  number  of  programming  systems  (such  as  the  AURORA  Compilation  Environment) 
use  performance  feedback  from  execution  traces  for  performance  tuning  in  the  compile/execute  cycle. 
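
The idea of using performance feedback to select code for a given environment can be sketched in a few lines. The example below is only in the spirit of empirical tuning; it does not reproduce how ATLAS, FFTW, or any compiler system works internally, and the function and variant names are hypothetical. Each candidate implementation is timed on a representative input and the fastest one is kept.

```python
import time


def pick_fastest(variants, argument, repeats=3):
    """Time each candidate implementation and return the name of the fastest.

    variants: dict mapping a name to a callable taking one argument.
    """
    best_name, best_time = None, float("inf")
    for name, func in variants.items():
        elapsed = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            func(argument)
            # Keep the best of several runs to reduce timing noise.
            elapsed = min(elapsed, time.perf_counter() - start)
        if elapsed < best_time:
            best_name, best_time = name, elapsed
    return best_name


def sum_loop(values):
    """Hand-written summation loop, one of two equivalent variants."""
    total = 0
    for v in values:
        total += v
    return total


variants = {"builtin": sum, "loop": sum_loop}
chosen = pick_fastest(variants, list(range(10000)))
```

Which variant wins depends on the machine and interpreter, which is the point: the choice is made from measurements in the target environment rather than fixed in advance.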

We  are  currently  developing  a  taxonomy  of  the  existing  approaches  in  this  field.  Following  this,  we  will 
study the possibility of extending the Vienna Fortran Compilation system and related performance tools to
demonstrate  proof-of-concept  solutions  for  relevant  application  problems. 

This  research  was  conducted  in  collaboration  with  Piyush  Mehrotra  (ICASE). 
Sidilkover, David: A new time-space accurate scheme for hyperbolic problems I: Quasi-explicit case. ICASE
Report  No.  98-25,  (NASA/CR-1998-208436),  October  27,  1998,  24  pages.  Submitted  to  Communications  in 
Applied  Analysis. 

This paper presents a new discretization scheme for hyperbolic systems of conservation laws. It satisfies
the  TVD  property  and  relies  on  the  new  high-resolution  mechanism  which  is  compatible  with  the  genuinely 
multidimensional  approach  proposed  recently.  This  work  can  be  regarded  as  a  first  step  towards  extending 
the  genuinely  multidimensional  approach  to  unsteady  problems.  Discontinuity  capturing  capabilities  and 
accuracy  of  the  scheme  are  verified  by  a  set  of  numerical  tests. 

Mavriplis,  Dimitri  J.:  On  convergence  acceleration  techniques  for  unstructured  meshes.  ICASE  Report  No. 
98-44.  (NASA/CR-1998-208732),  November  2,  1998,  35  pages. 

A discussion of convergence acceleration techniques as they relate to computational fluid dynamics problems
on unstructured meshes is given. Rather than providing a detailed description of particular methods,
the  various  different  building  blocks  of  current  solution  techniques  are  discussed  and  examples  of  solution 
strategies  using  one  or  several  of  these  ideas  are  given.  Issues  relating  to  unstructured  grid  CFD  problems 
are given additional consideration, including suitability of algorithms to current hardware trends, memory
and CPU tradeoffs, treatment of nonlinearities, and the development of efficient strategies for handling
anisotropy-induced  stiffness.  The  outlook  for  future  potential  improvements  is  also  discussed. 

Povitsky,  A.:  Parallel  directionally  split  solver  based  on  reformulation  of  pipelined  Thomas  algorithm. 
ICASE  Report  No.  98-45,  (NASA/CR-1998-208733),  October  27,  1998,  30  pages.  To  be  submitted  to  SIAM 
Journal  of  Scientific  Computing. 

In  this  research  an  efficient  parallel  algorithm  for  3-D  directionally  split  problems  is  developed.  The 
proposed  algorithm  is  based  on  a  reformulated  version  of  the  pipelined  Thomas  algorithm  that  starts  the 
backward  step  computations  immediately  after  the  completion  of  the  forward  step  computations  for  the  first 
portion  of  lines.  This  algorithm  has  data  available  for  other  computational  tasks  while  processors  are  idle 
from  the  Thomas  algorithm. 

The proposed 3-D directionally split solver is based on the static scheduling of processors where local and
non-local, data-dependent and data-independent computations are scheduled while processors are idle. A
theoretical model of parallelization efficiency is used to define optimal parameters of the algorithm, to show
an  asymptotic  parallelization  penalty  and  to  obtain  an  optimal  cover  of  a  global  domain  with  subdomains. 

It is shown by computational experiments and by the theoretical model that the proposed algorithm
reduces the parallelization penalty by about a factor of two relative to the basic algorithm for the range of the number of
processors  (subdomains)  considered  and  the  number  of  grid  nodes  per  subdomain. 
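
For reference, the forward and backward steps mentioned in this abstract are those of the standard serial Thomas algorithm for tridiagonal systems, sketched below. This is the textbook single-processor form only; it is not the pipelined or reformulated parallel algorithm of the report, and the function name and array layout are assumptions.

```python
def thomas(a, b, c, d):
    """Solve a tridiagonal system with the (serial) Thomas algorithm.

    a: sub-diagonal (a[0] unused), b: main diagonal,
    c: super-diagonal (c[-1] unused), d: right-hand side.
    No pivoting is done, so the matrix is assumed diagonally dominant.
    """
    n = len(d)
    cp = [0.0] * n
    dp = [0.0] * n
    # Forward step: eliminate the sub-diagonal.
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    # Backward step: substitute from the last unknown to the first.
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


# -x[i-1] + 4 x[i] - x[i+1] = d[i], with exact solution x = [1, 2, 3, 4].
a = [0.0, -1.0, -1.0, -1.0]
b = [4.0, 4.0, 4.0, 4.0]
c = [-1.0, -1.0, -1.0, 0.0]
d = [2.0, 4.0, 6.0, 13.0]
x = thomas(a, b, c, d)
```

Both loops carry a dependence from one unknown to the next along a line, which is why a line distributed across processors must be processed in a pipeline and why processors are idle at the start and end of each step.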


Chow, P.L., and L. Maestrello: Vibrational control of a nonlinear elastic panel. ICASE Report No. 98-46,
(NASA/CR-1998-208734), November 5, 1998, 19 pages. To be submitted to the Journal of the Acoustical
Society  of  America. 

The  paper  is  concerned  with  the  stabilization  of  the  nonlinear  panel  oscillation  by  an  active  control.  The 
control  is  actuated  by  a  combination  of  additive  and  parametric  vibrational  forces.  A  general  method  of 
vibrational  control  is  presented  for  stabilizing  panel  vibration  satisfying  a  nonlinear  beam  equation.  To  obtain 
analytical  results,  a  perturbation  technique  is  used  in  the  case  of  weak  nonlinearity.  Possible  application  to 
the  other  type  of  problems  is  briefly  discussed. 

Booker, Andrew J., J.E. Dennis, Jr., Paul D. Frank, David B. Serafini, Virginia Torczon, and Michael W.
Trosset: A rigorous framework for optimization of expensive functions by surrogates. ICASE Report No.
98-47, (NASA/CR-1998-208735), November 5, 1998, 24 pages. To appear in Structural Optimization.

The  goal  of  the  research  reported  here  is  to  develop  rigorous  optimization  algorithms  to  apply  to  some 
engineering design problems for which direct application of traditional optimization approaches is not practical.
This paper presents and analyzes a framework for generating a sequence of approximations to the
objective  function  and  managing  the  use  of  these  approximations  as  surrogates  for  optimization.  The  result 
is  to  obtain  convergence  to  a  minimizer  of  an  expensive  objective  function  subject  to  simple  constraints.  The 
approach  is  widely  applicable  because  it  does  not  require,  or  even  explicitly  approximate,  derivatives  of  the 
objective.  Numerical  results  are  presented  for  a  31-variable  helicopter  rotor  blade  design  example  and  for  a 
standard  optimization  test  example. 

Povitsky, A.: Parallelization of the pipelined Thomas algorithm. ICASE Report No. 98-48, (NASA/CR-1998-208736),
December 3, 1998, 26 pages. Submitted to the Journal of Parallel and Distributed Computing.

In  this  study  the  following  questions  are  addressed.  Is  it  possible  to  improve  the  parallelization  efficiency 
of the Thomas algorithm? How should the Thomas algorithm be formulated in order to get solved lines that
are  used  as  data  for  other  computational  tasks  while  processors  are  idle? 

To  answer  these  questions,  two-step  pipelined  algorithms  (PAs)  are  introduced  formally.  It  is  shown 
that  the  idle  processor  time  is  invariant  with  respect  to  the  order  of  backward  and  forward  steps  in  PAs 
starting  from  one  outermost  processor.  The  advantage  of  PAs  starting  from  two  outermost  processors  is 
small. Versions of the pipelined Thomas algorithms considered here fall into the category of PAs.

These results show that the parallelization efficiency of the Thomas algorithm cannot be improved directly.
However, the processor idle time can be used if some data has been computed by the time processors
become idle. To achieve this goal the Immediate Backward pipelined Thomas Algorithm (IB-PTA) is developed
in this article. The backward step is computed immediately after the forward step has been completed
for the first portion of lines. This enables the completion of the Thomas algorithm for some of these lines before
processors become idle. An algorithm for generating a static processor schedule recursively is developed.
This  schedule  is  used  to  switch  between  forward  and  backward  computations  and  to  control  communications 
between  processors.  The  advantage  of  the  IB-PTA  over  the  basic  PTA  is  the  presence  of  solved  lines,  which 
are  available  for  other  computations,  by  the  time  processors  become  idle. 
Rubinstein, Robert, and Ye Zhou: Effects of helicity on Lagrangian and Eulerian time correlations in turbulence,
ICASE Report No. 98-49, (NASA/CR-1998-208737), November 5, 1998, 10 pages. Submitted to
Physics  of  Fluids. 

Taylor series expansions of turbulent time correlation functions are applied to show that helicity influences
Eulerian time correlations more strongly than Lagrangian time correlations: to second order in time,
the helicity effect on Lagrangian time correlations vanishes, but the helicity effect on Eulerian time correlations
is nonzero. Fourier analysis shows that the helicity effect on Eulerian time correlations is confined to
the  largest  inertial  range  scales.  Some  implications  for  sound  radiation  by  swirling  flows  are  discussed. 

Abarbanel,  Saul,  Adi  Ditkowski,  and  Amir  Yefet:  Bounded  error  schemes  for  the  wave  equation  on  complex 
domains,  ICASE  Report  No.  98-50,  (NASA/CR-1998-208740),  November  20,  1998,  17  pages.  Submitted  to 
IEEE  Trans.  Antennas  Propagat. 

This  paper  considers  the  application  of  the  method  of  boundary  penalty  terms  (“SAT”)  to  the  numerical 
solution  of  the  wave  equation  on  complex  shapes  with  Dirichlet  boundary  conditions.  A  theory  is  developed, 
in  a  semi-discrete  setting,  that  allows  the  use  of  a  Cartesian  grid  on  complex  geometries,  yet  maintains  the 
order  of  accuracy  with  only  a  linear  temporal  error-bound.  A  numerical  example,  involving  the  solution  of 
Maxwell’s equations inside a 2-D circular wave-guide, demonstrates the efficacy of this method in comparison
to others (e.g., the staggered Yee scheme): we achieve a decrease of two orders of magnitude in the level of
the L2-error.

Darmofal, David: Eigenmode analysis of boundary conditions for the one-dimensional preconditioned Euler
equations, ICASE Report No. 98-51, (NASA/CR-1998-208741), November 20, 1998, 15 pages. To be
submitted  to  AIAA  1999  CFD  Conference  and  SIAM  Journal  of  Numerical  Analysis. 

An  analysis  of  the  effect  of  local  preconditioning  on  boundary  conditions  for  the  subsonic,  one-dimensional 
Euler  equations  is  presented.  Decay  rates  for  the  eigenmodes  of  the  initial  boundary  value  problem  are 
determined for different boundary conditions. Riemann invariant boundary conditions based on the unpreconditioned
Euler equations are shown to be reflective with preconditioning, and, at low Mach numbers,
disturbances do not decay. Other boundary conditions are investigated which are non-reflective with preconditioning,
and numerical results are presented confirming the analysis.

Tsynkov,  Semyon,  Saul  Abarbanel,  Jan  Nordstrom,  Viktor  Ryaben’kii,  and  Veer  Vatsa:  Global  artificial 
boundary  conditions  for  computation  of  external  flow  problems  with  propulsive  jets,  ICASE  Report  No.  98-52, 
(NASA/CR-1998-208746), December 3, 1998, 25 pages. Submitted to the 14th AIAA CFD Conference.

We  propose  new  global  artificial  boundary  conditions  (ABC’s)  for  computation  of  flows  with  propulsive 
jets.  The  algorithm  is  based  on  application  of  the  difference  potentials  method  (DPM).  Previously,  similar 
boundary  conditions  have  been  implemented  for  calculation  of  external  compressible  viscous  flows  around 
finite  bodies.  The  proposed  modification  substantially  extends  the  applicability  range  of  the  DPM-based 
algorithm. In the paper, we present the general formulation of the problem, describe our numerical methodology,
and discuss the corresponding computational results. The particular configuration that we analyze is
a  slender  three-dimensional  body  with  boat-tail  geometry  and  supersonic  jet  exhaust  in  a  subsonic  external 
flow  under  zero  angle  of  attack.  Similarly  to  the  results  obtained  earlier  for  the  flows  around  airfoils  and 
wings, current results for the jet flow case corroborate the superiority of the DPM-based ABC’s over standard
local  methodologies  from  the  standpoints  of  accuracy,  overall  numerical  performance,  and  robustness. 

Xu,  Kun:  Gas-kinetic  theory  based  flux  splitting  method  for  ideal  magnetohydrodynamics.  ICASE  Report  No. 
98-53. (NASA/CR-1998-208747), December 3, 1998, 22 pages. To be submitted to the Journal of Computational
Physics. 

A  gas-kinetic  solver  is  developed  for  the  ideal  magnetohydrodynamics  (MHD)  equations.  The  new 
scheme  is  based  on  the  direct  splitting  of  the  flux  function  of  the  MHD  equations  with  the  inclusion  of 
“particle”  collisions  in  the  transport  process.  Consequently,  the  artificial  dissipation  in  the  new  scheme  is 
much  reduced  in  comparison  with  the  MHD  Flux  Vector  Splitting  Scheme.  At  the  same  time,  the  new 
scheme  is  compared  with  the  well-developed  Roe-type  MHD  solver.  It  is  concluded  that  the  kinetic  MHD 
scheme is more robust and efficient than the Roe-type method, and the accuracy is competitive. In this paper
the general principle of splitting the macroscopic flux function based on the gas-kinetic theory is presented.
The flux construction strategy may shed some light on the possible modification of AUSM- and CUSP-type
schemes for the compressible Euler equations, as well as to the development of new schemes for a non-strictly
hyperbolic system.

Holt, Maurice: 3D characteristics. ICASE Report No. 98-54, (NASA/CR-1998-208958), December 23, 1998,
10  pages.  Submitted  to  Springer  Series  in  Computational  Physics. 

Contributions to the Method of Characteristics in Three Dimensions, which previously received incomplete
recognition, are reviewed. They mostly follow from a fundamental paper by Rusanov which led to
several  developments  in  Russia,  described  by  Chushkin. 

Lian, Yongsheng, and Kun Xu: A gas-kinetic scheme for reactive flows. ICASE Report No. 98-55, (NASA/CR-1998-208963),
December 23, 1998, 16 pages. To be submitted to Computers and Fluids.

In  this  paper,  the  gas-kinetic  BGK  scheme  for  the  compressible  flow  equations  is  extended  to  chemical 
reactive  flow.  The  mass  fraction  of  the  unburnt  gas  is  implemented  into  the  gas  kinetic  equation  by  assigning 
a  new  internal  degree  of  freedom  to  the  particle  distribution  function.  The  new  variable  can  be  also  used  to 
describe  fluid  trajectory  for  the  nonreactive  flows.  Due  to  the  gas-kinetic  BGK  model,  the  current  scheme 
basically  solves  the  Navier-Stokes  chemical  reactive  flow  equations.  Numerical  tests  validate  the  accuracy 
and  robustness  of  the  current  kinetic  method. 

Xu, Kun, and Shiu-Hong Lui: Rayleigh-Benard simulation using gas-kinetic BGK scheme in the incompressible
limit. ICASE Report No. 98-56, (NASA/CR-1998-208964), December 23, 1998, 19 pages. Submitted to
Physical  Review  E. 

In  this  paper,  a  gas-kinetic  BGK  model  is  constructed  for  the  Rayleigh-Benard  thermal  convection  in 
the  incompressible  flow  limit,  where  the  flow  field  and  temperature  field  are  described  by  two  coupled  BGK 
models.  Since  the  collision  times  and  pseudo-temperature  in  the  corresponding  BGK  models  can  be  different, 
the Prandtl number can be changed to any value instead of a fixed Pr=1 in the original BGK model. The
2D  Rayleigh-Benard  thermal  convection  is  studied  and  numerical  results  are  compared  with  theoretical  ones 
as well as other simulation results.

Lewis, Robert Michael, Virginia Torczon, and Michael W. Trosset: Why pattern search works, ICASE Report
No.  98-57,  (NASA/CR-1998-208966),  December  23, 1998, 17  pages.  To  appear  in  Optima,  The  Mathematical 
Programming  Society  Newsletter. 

Pattern  search  methods  are  a  class  of  direct  search  methods  for  nonlinear  optimization.  Since  the 
introduction  of  the  original  pattern  search  methods  in  the  late  1950s  and  early  1960s,  they  have  remained 
popular  with  users  due  to  their  simplicity  and  the  fact  that  they  work  well  in  practice  on  a  variety  of 
problems.  More  recently,  the  fact  that  they  are  provably  convergent  has  generated  renewed  interest  in  the 
nonlinear  programming  community.  The  purpose  of  this  article  is  to  describe  what  pattern  search  methods 
are  and  why  they  work. 
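
As a rough illustration of the idea, the following Python sketch implements compass (coordinate) search, one simple member of the pattern search family. It is a minimal sketch under stated assumptions, not the formulation analyzed in the report: the function name `compass_search`, the step-halving rule, and the stopping tolerance are all hypothetical choices made for this example.

```python
from typing import Callable, List, Sequence, Tuple


def compass_search(
    f: Callable[[List[float]], float],
    x0: Sequence[float],
    step: float = 1.0,
    tol: float = 1e-8,
    max_iter: int = 10_000,
) -> Tuple[List[float], float]:
    """Minimize f using only function values, polling +/- step along each axis."""
    x = list(x0)
    fx = f(x)
    for _ in range(max_iter):
        if step < tol:
            break  # the pattern has shrunk below the tolerance
        improved = False
        for i in range(len(x)):
            for sign in (1.0, -1.0):
                trial = x.copy()
                trial[i] += sign * step
                ft = f(trial)
                if ft < fx:  # accept the first trial point that decreases f
                    x, fx = trial, ft
                    improved = True
                    break
            if improved:
                break
        if not improved:
            step *= 0.5  # no poll point was better: contract the pattern
    return x, fx


if __name__ == "__main__":
    # Hypothetical test problem: a shifted quadratic bowl.
    best_x, best_f = compass_search(lambda p: (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2, [0.0, 0.0])
    print(best_x, best_f)
```

No derivatives are evaluated anywhere; the method relies only on comparing function values at the poll points and shrinking the step when none of them improves on the current iterate.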

Kumfert,  Gary,  and  Alex  Pothen:  An  object-oriented  collection  of  minimum  degree  algorithms:  Design, 
implementation,  and  experiences,  ICASE  Report  No.  99-1,  (NASA/CR-1999-208977),  January  29,  1999,  15 
pages.  In  Computing  in  Object-oriented  Parallel  Environments,  Lecture  Notes  in  Computer  Science  1505. 

The multiple minimum degree (MMD) algorithm and its variants have enjoyed 20+ years of research
and progress in generating fill-reducing orderings for sparse, symmetric positive definite matrices. Although
conceptually simple, efficient implementations of these algorithms are deceptively complex and highly specialized.

In this case study, we present an object-oriented library that implements several recent minimum degree-like
algorithms. We discuss how object-oriented design forces us to decompose these algorithms in a different
manner than earlier codes and demonstrate how this impacts the flexibility and efficiency of our C++
implementation.  We  compare  the  performance  of  our  code  against  other  implementations  in  C  or  Fortran. 
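
To make the underlying idea concrete, here is a minimal Python sketch of the basic minimum degree heuristic on an adjacency-set graph. It is an assumption-laden toy, not the authors' C++ library and not the MMD variant itself: the function name `minimum_degree_ordering`, the tie-breaking by vertex label, and the explicit clique formation are illustrative choices, and efficient codes avoid exactly this kind of explicit graph update.

```python
from typing import Dict, List, Set, Tuple


def minimum_degree_ordering(adj: Dict[int, Set[int]]) -> Tuple[List[int], int]:
    """Return an elimination order and the number of fill edges it creates."""
    graph = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    order: List[int] = []
    fill = 0
    while graph:
        # Pick a vertex of minimum current degree (ties broken by label).
        v = min(graph, key=lambda u: (len(graph[u]), u))
        nbrs = graph.pop(v)
        for u in nbrs:
            graph[u].discard(v)
        # Eliminating v makes its remaining neighbours pairwise adjacent.
        nbr_list = sorted(nbrs)
        for i, a in enumerate(nbr_list):
            for b in nbr_list[i + 1:]:
                if b not in graph[a]:
                    graph[a].add(b)
                    graph[b].add(a)
                    fill += 1
        order.append(v)
    return order, fill


if __name__ == "__main__":
    # Hypothetical example: a star graph with centre 0 and leaves 1..4.
    star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
    print(minimum_degree_ordering(star))
```

On the star graph the heuristic eliminates leaves first and creates no fill, whereas eliminating the centre first would connect all four leaves pairwise and create six fill edges.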

Dobrian,  Florin,  Gary  Kumfert,  and  Alex  Pothen:  Object-oriented  design  for  sparse  direct  solvers.  ICASE 
Report  No.  99-2,  (NASA/CR-1999-208978),  January  20,  1999,  12  pages.  In  Computing  in  Object-oriented 
Parallel  Environments,  Lecture  Notes  in  Computer  Science  1505. 

We  discuss  the  object-oriented  design  of  a  software  package  for  solving  sparse,  symmetric  systems  of 
equations  (positive  definite  and  indefinite)  by  direct  methods.  At  the  highest  layers,  we  decouple  data 
structure  classes  from  algorithmic  classes  for  flexibility.  We  describe  the  important  structural  and  algorithmic 
classes  in  our  design,  and  discuss  the  trade-offs  we  made  for  high  performance.  The  kernels  at  the  lower 
layers  were  optimized  by  hand.  Our  results  show  no  performance  loss  from  our  object-oriented  design,  while 
providing flexibility, ease of use, and extensibility over solvers using procedural design.

Cleaveland, Rance, Gerald Lüttgen, and V. Natarajan: Priority in process algebras, ICASE Report No. 99-3,
(NASA/CR-1999-208979),  January  25,  1999,  48  pages.  To  appear  in  Handbook  of  Process  Algebras. 

This  paper  surveys  the  semantic  ramifications  of  extending  traditional  process  algebras  with  notions  of 
priority  that  allow  for  some  transitions  to  be  given  precedence  over  others.  These  enriched  formalisms  allow 
one  to  model  system  features  such  as  interrupts,  prioritized  choice,  or  real-time  behavior. 

Approaches  to  priority  in  process  algebras  can  be  classified  according  to  whether  the  induced  notion  of 
pre-emption  on  transitions  is  global  or  local  and  whether  priorities  are  static  or  dynamic.  Early  work  in  the 
area  concentrated  on  global  pre-emption  and  static  priorities  and  led  to  formalisms  for  modeling  interrupts 
and aspects of real-time, such as maximal progress, in centralized computing environments. More recent
research has investigated localized notions of pre-emption in which the distribution of systems is taken into
account, as well as dynamic priority approaches, i.e., those where priority values may change as systems
evolve.  The  latter  allows  one  to  model  behavioral  phenomena  such  as  scheduling  algorithms  and  also  enables 
the  efficient  encoding  of  real-time  semantics. 

Technically,  this  paper  studies  the  different  models  of  priorities  by  presenting  extensions  of  Milner’s 
Calculus  of  Communicating  Systems  (CCS)  with  static  and  dynamic  priority  as  well  as  with  notions  of 
global  and  local  pre-emption.  In  each  case  the  operational  semantics  of  CCS  is  modified  appropriately, 
behavioral  theories  based  on  strong  and  weak  bisimulation  are  given,  and  related  approaches  for  different 
process-algebraic  settings  are  discussed. 
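
The notion of global pre-emption with static priorities can be illustrated by a deliberately simplified Python sketch. It is not the operational semantics of any calculus in the survey: the `Transition` record, the convention that 0 is the highest priority, and the function name `preempt` are hypothetical, and the sketch ignores any further conditions that the surveyed calculi place on which actions may pre-empt others.

```python
from typing import List, NamedTuple


class Transition(NamedTuple):
    action: str
    priority: int  # assumption for this sketch: 0 is the highest priority
    target: str


def preempt(enabled: List[Transition]) -> List[Transition]:
    """Keep only the enabled transitions that no higher-priority one pre-empts."""
    if not enabled:
        return []
    best = min(t.priority for t in enabled)
    return [t for t in enabled if t.priority == best]


if __name__ == "__main__":
    # Hypothetical state in which an interrupt competes with ordinary work.
    state = [
        Transition("interrupt", 0, "Handler"),
        Transition("work", 1, "Busy"),
        Transition("log", 1, "Busy"),
    ]
    print(preempt(state))
```

With static priorities the `priority` field is fixed per action; a dynamic-priority variant would recompute it as the system evolves before applying the same filter.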

Lüttgen, Gerald, Girish Bhat, and Rance Cleaveland: A practical approach to implementing real-time semantics.
ICASE Report No. 99-4, (NASA/CR-1999-208980), January 25, 1999, 33 pages. To appear in
Annals  of  Software  Engineering. 

This  paper  investigates  implementations  of  process  algebras  which  are  suitable  for  modeling  concurrent 
real-time  systems.  It  suggests  an  approach  for  efficiently  implementing  real-time  semantics  using  dynamic 
priorities.  For  this  purpose  a  process  algebra  with  dynamic  priority  is  defined,  whose  semantics  corresponds 
one-to-one to traditional real-time semantics. The advantage of the dynamic-priority approach is that it
drastically  reduces  the  state-space  sizes  of  the  systems  in  question  while  preserving  all  properties  of  their 
functional  and  real-time  behavior. 

The utility of the technique is demonstrated by a case study which deals with the formal modeling
and  verification  of  the  SCSI-2  bus-protocol.  The  case  study  is  carried  out  in  the  Concurrency  Workbench 
of  North  Carolina,  an  automated  verification  tool  in  which  the  process  algebra  with  dynamic  priority  is 
implemented.  It  turns  out  that  the  state  space  of  the  bus-protocol  model  is  about  an  order  of  magnitude 
smaller  than  the  one  resulting  from  real-time  semantics.  The  accuracy  of  the  model  is  proved  by  applying 
model  checking  for  verifying  several  mandatory  properties  of  the  bus  protocol. 

Lui,  Shiuhong,  and  Kun  Xu:  Entropy  analysis  of  kinetic  flux  vector  splitting  schemes  for  the  compressible 
Euler equations. ICASE Report No. 99-5, (NASA/CR-1999-208981), January 29, 1999, 18 pages.

Flux Vector Splitting (FVS) schemes are one group of approximate Riemann solvers for the compressible
Euler  equations.  In  this  paper,  the  discretized  entropy  condition  of  the  Kinetic  Flux  Vector  Splitting  (KFVS) 
scheme  based  on  the  gas-kinetic  theory  is  proved.  The  proof  of  the  entropy  condition  involves  the  entropy 
definition  difference  between  the  distinguishable  and  indistinguishable  particles. 

Xu,  Kun:  Gas  evolution  dynamics  in  Godunov-type  schemes  and  analysis  of  numerical  shock  instability. 
ICASE  Report  No.  99-6,  (NASA/CR-1999-208985),  January  28,  1999,  21  pages.  To  be  submitted  to  the 
International  Journal  of  Numerical  Methods  in  Fluids. 

In this paper we study the gas evolution dynamics of the exact and approximate Riemann
solvers, e.g., the Flux Vector Splitting (FVS) and the Flux Difference Splitting (FDS) schemes. Since the
FVS scheme and the Kinetic Flux Vector Splitting (KFVS) scheme have the same physical mechanism and
similar flux functions, the weaknesses and advantages of the FVS scheme are examined closely through an
analysis of the discretized KFVS scheme. The subtle dissipative mechanism of the Godunov method in the 2D
case is also analyzed, and the physical reason for shock instability, i.e., carbuncle phenomena and odd-even
decoupling,  is  presented. 

Rubinstein, Robert: Double resonance and spectral scaling in the weak turbulence theory of rotating and stratified
turbulence. ICASE Report No. 99-7, (NASA/CR-1999-208996), February 5, 1999, 19 pages. Submitted
to  Physical  Review  E. 

In rotating turbulence, stably stratified turbulence, and in rotating stratified turbulence, heuristic arguments
concerning the turbulent time scale suggest that the inertial range energy spectrum scales as $k^{-2}$.
From the viewpoint of weak turbulence theory, there are three possibilities which might invalidate these arguments:
four-wave interactions could dominate three-wave interactions leading to a modified inertial range
energy balance, double resonances could alter the time scale, and the energy flux integral might not converge.
It is shown that although double resonances exist in all of these problems, they do not influence overall energy
transfer. However, the resonance conditions cause the flux integral for rotating turbulence to diverge
logarithmically when evaluated for a $k^{-2}$ energy spectrum; therefore, this spectrum requires logarithmic
corrections.  Finally,  the  role  of  four-wave  interactions  is  briefly  discussed. 

Rubinstein,  Robert,  and  Ye  Zhou:  The  dissipation  range  in  rotating  turbulence.  ICASE  Report  No.  99-8, 
(NASA/CR-1999-208997),  February  5,  1999,  11  pages.  Submitted  to  Physical  Review  E. 

The  dissipation  range  energy  balance  of  the  direct  interaction  approximation  is  applied  to  rotating 
turbulence when rotation effects persist well into the dissipation range. Assuming that $Ro\,Re^{1/2} \ll 1$
and that three-wave interactions are dominant, the dissipation range is found to be concentrated in the
wavevector plane perpendicular to the rotation axis. This conclusion is consistent with previous analyses
of  inertial  range  energy  transfer  in  rotating  turbulence,  which  predict  the  accumulation  of  energy  in  those 
scales. 
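
A standard scaling argument, sketched here under assumptions that the abstract does not state, suggests why a small value of $Ro\,Re^{1/2}$ corresponds to rotation effects persisting into the dissipation range. The assumptions are the dissipation estimate $\varepsilon \sim U^3/L$, the Kolmogorov wavenumber $k_d$, and a rotation wavenumber $k_\Omega$ below which the rotation rate $\Omega$ exceeds the local eddy turnover rate; the symbols $U$, $L$, $\nu$, $k_d$, and $k_\Omega$ are introduced only for this illustration.

```latex
k_\Omega = \left(\frac{\Omega^{3}}{\varepsilon}\right)^{1/2}, \qquad
k_d = \left(\frac{\varepsilon}{\nu^{3}}\right)^{1/4}, \qquad
Ro = \frac{U}{\Omega L}, \qquad Re = \frac{U L}{\nu}, \qquad
\varepsilon \sim \frac{U^{3}}{L}
\\[6pt]
\frac{k_d}{k_\Omega}
  = \frac{\varepsilon^{3/4}}{\nu^{3/4}\,\Omega^{3/2}}
  = \left(\frac{\varepsilon^{1/2}}{\nu^{1/2}\,\Omega}\right)^{3/2}
  = \left(Ro\,Re^{1/2}\right)^{3/2}
```

Under these estimates, $Ro\,Re^{1/2} \ll 1$ is equivalent to $k_d \ll k_\Omega$, so the whole dissipation range lies at scales where rotation remains dominant.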

Mavriplis,  Dimitri  J.,  and  S.  Pirzadeh:  Large-scale  parallel  unstructured  mesh  computations  for  3D  high-lift 
analysis.  ICASE  Report  No.  99-9,  (NASA/CR-1999-208999),  February  11,  1999,  26  pages.  Submitted  to 
AIAA  Journal  of  Aircraft. 

A  complete  “geometry  to  drag-polar”  analysis  capability  for  the  three-dimensional  high-lift  configurations 
is  described.  The  approach  is  based  on  the  use  of  unstructured  meshes  in  order  to  enable  rapid  turnaround 
for  complicated  geometries  that  arise  in  high-lift  configurations.  Special  attention  is  devoted  to  creating  a 
capability  for  enabling  analyses  on  highly  resolved  grids.  Unstructured  meshes  of  several  million  vertices  are 
initially  generated  on  a  work-station,  and  subsequently  refined  on  a  supercomputer.  The  flow  is  solved  on 
these  refined  meshes  on  large  parallel  computers  using  an  unstructured  agglomeration  multigrid  algorithm. 
Good  prediction  of  lift  and  drag  throughout  the  range  of  incidences  is  demonstrated  on  a  transport  take-off 
configuration  using  up  to  24.7  million  grid  points.  The  feasibility  of  using  this  approach  in  a  production 
environment  on  existing  parallel  machines  is  demonstrated,  as  well  as  the  scalability  of  the  solver  on  machines 
using up to 1450 processors.

Xu, Kun, and Li-Shi Luo: Connection between the lattice Boltzmann equation and the beam scheme. ICASE
Report  No.  99-10,  (NASA/CR-1999-209001),  February  12,  1999,  14  pages.  Submitted  to  Physical  Review 
E. 


In this paper we analyze and compare the lattice Boltzmann equation with the beam scheme in detail.
We note the similarities and differences between the lattice Boltzmann equation and the beam scheme. We
show  that  the  accuracy  of  the  lattice  Boltzmann  equation  is  indeed  second  order  in  space.  We  discuss  the 
advantages  and  limitations  of  lattice  Boltzmann  equation  and  the  beam  scheme.  Based  on  our  analysis,  we 
propose  an  improved  multi-dimensional  beam  scheme. 

Baggag,  Abdelkader,  Harold  Atkins,  Can  Ozturan,  and  David  Keyes:  Parallelization  of  an  object-oriented 
unstructured  aeroacoustics  solver.  ICASE  Report  No.  99-11,  (NASA/CR-1999-209098),  February  16,  1999, 
16  pages.  Submitted  to  the  Proceedings  of  the  9th  SIAM  Conference  on  Parallel  Processing  for  Scientific 
Computing. 

A  computational  aeroacoustics  code  based  on  the  discontinuous  Galerkin  method  is  ported  to  several 
parallel  platforms  using  MPI.  The  discontinuous  Galerkin  method  is  a  compact  high-order  method  that 
retains  its  accuracy  and  robustness  on  non-smooth  unstructured  meshes.  In  its  semi-discrete  form,  the 
discontinuous  Galerkin  method  can  be  combined  with  explicit  time  marching  methods  making  it  well  suited 
to  time  accurate  computations.  The  compact  nature  of  the  discontinuous  Galerkin  method  also  makes  it 
well  suited  for  distributed  memory  parallel  platforms.  The  original  serial  code  was  written  using  an  object- 
oriented  approach  and  was  previously  optimized  for  cache-based  machines.  The  port  to  parallel  platforms 
was  achieved  simply  by  treating  partition  boundaries  as  a  type  of  boundary  condition.  Code  modifications 
were minimal because boundary conditions were abstractions in the original program. Scalability results
are  presented  for  the  SGI  Origin,  IBM  SP2,  and  clusters  of  SGI  and  Sun  workstations.  Slightly  superlinear 
speedup  is  achieved  on  a  fixed-size  problem  on  the  Origin,  due  to  cache  effects. 

Arian,  E.,  A.  Batterman,  and  E.W.  Sachs:  Approximation  of  the  Newton  step  by  a  defect  correction  process. 
ICASE  Report  No.  99-12,  (NASA/CR-1999-209099),  February  16, 1999,  35  pages.  To  be  submitted  to  SIAM 
Journal  of  Optimization. 

In  this  paper,  an  optimal  control  problem  governed  by  a  partial  differential  equation  is  considered.  The 
Newton  step  for  this  system  can  be  computed  by  solving  a  coupled  system  of  equations.  To  do  this  efficiently 
with  an  iterative  defect  correction  process,  a  modifying  operator  is  introduced  into  the  system.  This  operator 
is  motivated  by  local  mode  analysis.  The  operator  can  be  used  also  for  preconditioning  in  GMRES.  We  give 
a  detailed  convergence  analysis  for  the  defect  correction  process  and  show  the  derivation  of  the  modifying 
operator.  Numerical  tests  are  done  on  the  small  disturbance  shape  optimization  problem  in  two  dimensions 
for  the  defect  correction  process  and  for  GMRES. 

Rubinstein,  Robert,  and  Aaron  H.  Auslender:  Relaxation  from  steady  states  far  from  equilibrium  and  the 
persistence  of  anomalous  shock  behavior  in  weakly  ionized  gases.  ICASE  Report  No.  99-13,  (NASA/CR- 
1999-209105),  March  3,  1999,  13  pages.  To  be  submitted  to  Physical  Review  E. 

The  decay  of  anomalous  effects  on  shock  waves  in  weakly  ionized  gases  following  plasma  generator 
extinction  has  been  measured  in  the  anticipation  that  the  decay  time  must  correlate  well  with  the  relaxation 
time of the mechanism responsible for the anomalous effects. When the relaxation times cannot be measured
directly,  they  are  inferred  theoretically,  usually  assuming  that  the  initial  state  is  nearly  in  thermal  equilibrium. 
In  this  paper,  it  is  demonstrated  that  relaxation  from  any  steady  state  far  from  equilibrium,  including  the 
state  of  a  weakly  ionized  gas,  can  proceed  much  more  slowly  than  arguments  based  on  relaxation  from 
near  equilibrium  states  might  suggest.  This  result  justifies  a  more  careful  analysis  of  the  relaxation  times 
in  weakly  ionized  gases  and  suggests  that  although  the  experimental  measurements  of  relaxation  times  did 
not  lead  to  an  unambiguous  conclusion,  this  approach  to  understanding  the  anomalous  effects  may  warrant 
further  investigation. 

Diskin,  Boris,  and  James  L.  Thomas:  Solving  upwind-biased  discretizations:  Defect-correction  iterations, 
ICASE  Report  No.  99-14,  (NASA/CR-1999-209106),  March  3,  1999,  27  pages.  To  be  submitted  to  the 
SIAM  Journal  of  Scientific  Computing. 

This  paper  considers  defect-correction  solvers  for  a  second  order  upwind-biased  discretization  of  the  2D 
convection equation. The following important features are reported:

1.  The  asymptotic  convergence  rate  is  about  0.5  per  defect-correction  iteration. 

2.  If  the  operators  involved  in  defect-correction  iterations  have  different  approximation  order,  then 
the  initial  convergence  rates  may  be  very  slow.  The  number  of  iterations  required  to  get  into  the 
asymptotic  convergence  regime  might  grow  on  fine  grids  as  a  negative  power  of  h.  In  the  case  of  a 
second  order  target  operator  and  a  first  order  driver  operator,  this  number  of  iterations  is  roughly 
proportional  to 

3.  If  both  the  operators  have  the  second  approximation  order,  the  defect-correction  solver  demonstrates 
the  asymptotic  convergence  rate  after  three  iterations  at  most.  The  same  three  iterations  are  required 
to  converge  algebraic  error  below  the  truncation  error  level. 

A  novel  comprehensive  half-space  Fourier  mode  analysis  (which,  by  the  way,  can  take  into  account  the 
influence  of  discretized  outflow  boundary  conditions  as  well)  for  the  defect-correction  method  is  developed. 
This  analysis  explains  many  phenomena  observed  in  solving  non-elliptic  equations  and  provides  a  close 
prediction  of  the  actual  solution  behavior.  It  predicts  the  convergence  rate  for  each  iteration  and  the 
asymptotic  convergence  rate.  As  a  result  of  this  analysis,  a  new  very  efficient  adaptive  multigrid  algorithm 
solving  the  discrete  problem  to  within  a  given  accuracy  is  proposed.  Numerical  simulations  confirm  the 
accuracy  of  the  analysis  and  the  efficiency  of  the  proposed  algorithm.  The  results  of  the  numerical  tests  are 
reported. 
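
The structure of a defect-correction iteration can be sketched in a much simpler setting than the one studied in the report. The Python example below is a toy under stated assumptions: it uses the 1D model equation u' = f with zero inflow data instead of the 2D convection equation, a pure second-order upwind stencil as the target operator, a first-order upwind stencil as the driver, and hypothetical function names. It shows only how the driver is inverted against the residual of the target operator, and makes no claim about the convergence rates quoted above.

```python
from typing import List


def apply_target(u: List[float], h: float) -> List[float]:
    """Second-order upwind operator (3u_i - 4u_{i-1} + u_{i-2}) / (2h), zero inflow."""
    out = []
    for i in range(len(u)):
        um1 = u[i - 1] if i >= 1 else 0.0
        um2 = u[i - 2] if i >= 2 else 0.0
        out.append((3.0 * u[i] - 4.0 * um1 + um2) / (2.0 * h))
    return out


def solve_driver(r: List[float], h: float) -> List[float]:
    """Solve the first-order upwind system (e_i - e_{i-1}) / h = r_i by marching."""
    e, prev = [], 0.0
    for ri in r:
        prev = prev + h * ri
        e.append(prev)
    return e


def solve_target(f: List[float], h: float) -> List[float]:
    """Direct solve of the target system by forward substitution (reference only)."""
    u: List[float] = []
    for i, fi in enumerate(f):
        um1 = u[i - 1] if i >= 1 else 0.0
        um2 = u[i - 2] if i >= 2 else 0.0
        u.append((2.0 * h * fi + 4.0 * um1 - um2) / 3.0)
    return u


def defect_correction(f: List[float], h: float, iterations: int) -> List[float]:
    """Iterate u <- u + driver^{-1} (f - target(u)), starting from u = 0."""
    u = [0.0] * len(f)
    for _ in range(iterations):
        residual = [fi - ti for fi, ti in zip(f, apply_target(u, h))]
        correction = solve_driver(residual, h)
        u = [ui + ci for ui, ci in zip(u, correction)]
    return u


if __name__ == "__main__":
    n = 8
    h = 1.0 / n
    f = [1.0] * n
    approx = defect_correction(f, h, 60)
    exact = solve_target(f, h)
    print(max(abs(a - b) for a, b in zip(approx, exact)))
```

Only the cheap first-order driver is ever inverted inside the loop; the second-order target operator is merely applied to form the residual, which is the point of the method.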

INTERIM REPORTS


Bokhari,  Shahid  H.,  and  Dimitri  J.  Mavriplis:  The  Tera  multithreaded  architecture  and  unstructured  meshes. 
ICASE Interim Report No. 33, (NASA/CR-1998-208953), December 11, 1998, 23 pages.

The  Tera  Multithreaded  Architecture  (MTA)  is  a  new  parallel  supercomputer  currently  being  installed 
at San Diego Supercomputing Center (SDSC). This machine has an architecture quite different from contemporary
parallel machines. The computational processor is a custom design and the machine uses hardware
to  support  very  fine  grained  multithreading.  The  main  memory  is  shared,  hardware  randomized  and  flat. 
These  features  make  the  machine  highly  suited  to  the  execution  of  unstructured  mesh  problems,  which  are 
difficult  to  parallelize  on  other  architectures. 

We report the results of a study carried out during July-August 1998 to evaluate the execution of EUL3D,
a code that solves the Euler equations on an unstructured mesh, on the 2-processor Tera MTA at SDSC.

Our  investigation  shows  that  parallelization  of  an  unstructured  code  is  extremely  easy  on  the  Tera.  We 
were  able  to  get  an  existing  parallel  code  (designed  for  a  shared  memory  machine),  running  on  the  Tera 
by  changing  only  the  compiler  directives.  Furthermore,  a  serial  version  of  this  code  was  compiled  to  run 
in parallel on the Tera by judicious use of directives to invoke the “full/empty” tag bits of the machine to
obtain  synchronization.  This  version  achieves  212  and  406  Mflop/s  on  one  and  two  processors  respectively, 
and  requires  no  attention  to  partitioning  or  placement  of  data — issues  that  would  be  of  paramount  importance 
in  other  parallel  architectures. 
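
As a worked check of the quoted rates, the speedup and parallel efficiency on two processors follow directly from the two Mflop/s figures. The symbols $S_2$ and $E_2$ are introduced here only for this calculation, which assumes that both runs perform the same amount of floating-point work.

```latex
S_2 = \frac{406\ \text{Mflop/s}}{212\ \text{Mflop/s}} \approx 1.92,
\qquad
E_2 = \frac{S_2}{2} \approx 0.96
```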


Yefet,  A.,  and  E.  Turkel:  Construction  of  three  dimensional  solutions  for  the  Maxwell  equations.  ICASE 
Interim Report No. 34, (NASA/CR-1998-208954), December 11, 1998, 10 pages.

We  consider  numerical  solutions  for  the  three  dimensional  time  dependent  Maxwell  equations.  We 
construct  a  fourth  order  accurate  compact  implicit  scheme  and  compare  it  to  the  Yee  scheme  for  free  space 
in  a  box. 


Trosset,  Michael  W.:  The  Krigifier:  A  procedure  for  generating  pseudorandom  nonlinear  objective  functions 
for  computational  experimentation.  ICASE  Interim  Report  No.  35,  (NASA/CR-1999-209000),  February  11, 
1999,  11  pages. 

Comprehensive computational experiments to assess the performance of algorithms for numerical optimization
require (among other things) a practical procedure for generating pseudorandom nonlinear objective
functions. We propose a procedure that is based on the convenient fiction that objective functions are realizations
of stochastic processes. This report details the calculations necessary to implement our procedure
for  the  case  of  certain  stationary  Gaussian  processes  and  presents  a  specific  implementation  in  the  statistical 
programming  language  S-PLUS. 
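
The central idea, treating an objective function as one realization of a stationary Gaussian process, can be sketched in a few lines of Python. This is not the procedure or the S-PLUS implementation from the report: the Gaussian correlation function, the parameter names `sigma2`, `length`, and `jitter`, and the restriction to sampling values at a fixed set of 1D points are all assumptions made for illustration.

```python
import math
import random
from typing import List


def gaussian_cov(xs: List[float], sigma2: float, length: float) -> List[List[float]]:
    """Stationary covariance sigma2 * exp(-((a - b) / length)^2) between all point pairs."""
    return [[sigma2 * math.exp(-(((a - b) / length) ** 2)) for b in xs] for a in xs]


def cholesky(a: List[List[float]]) -> List[List[float]]:
    """Lower-triangular factor L with L L^T = a (a must be symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def sample_objective(
    xs: List[float],
    seed: int,
    sigma2: float = 1.0,
    length: float = 0.3,
    jitter: float = 1e-10,
) -> List[float]:
    """Draw one pseudorandom realization of the process at the points xs."""
    cov = gaussian_cov(xs, sigma2, length)
    for i in range(len(xs)):
        cov[i][i] += jitter  # small diagonal shift to keep the factorization stable
    low = cholesky(cov)
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in xs]
    # Correlated values are L z, where z holds independent standard normals.
    return [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(len(xs))]


if __name__ == "__main__":
    grid = [0.0, 0.25, 0.5, 0.75, 1.0]
    print(sample_objective(grid, seed=1))
```

Fixing the seed makes each test function reproducible, while changing it yields a different function drawn from the same process.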

ICASE COLLOQUIA
October  1,  1998  -  March  31,  1999 


Name/Affiliation/Title    Date

Cleaveland, Rance, State University of New York at Stony Brook    October 1
“An Operational Semantics of Temporal Logic”

Ewins, David, Imperial College of Science, Technology & Medicine, London, UK    October 19
“Modal Testing on the Move”

Bartz, Dirk, University of Tuebingen, Germany    October 26
“Constructing Recursive Tree Hierarchies in Parallel”

Ho, Chih-Ming, University of California, Los Angeles    October 26
“M3 System for Aerodynamic Control”

Huettner, Tobias, University of Tuebingen, Germany    October 26
“High-resolution Textures”

Morris, Kirsten, University of Waterloo, Canada    November 9
“Achievable Noise Reduction in Ducts”

Simpson, Robert W., Massachusetts Institute of Technology    November 10
“The Problems of Quantifying Safety in Aviation”

Davies, Roger, University of Arizona    November 19
“Progress Towards a Panorama of Cloud Reflectivity”

Shapiro, Gerald, Logistics Management Institute, McLean, VA    November 19
“Modeling the Safety Implications of Aviation Operations”

Barnett, Arnold, Massachusetts Institute of Technology    November 23
“Aviation Safety in Numbers”

Stathopoulos, Andreas, The College of William & Mary    November 23
“Restarting Techniques for Arnoldi-like Eigenvalue Methods”

Curkendall, David, Jet Propulsion Laboratory    December 2
“Data Visualization Corridors and the Digital Earth”

Stals, Linda, Old Dominion University    December 11
“Parallel Implementation of Multigrid Methods on Unstructured Grids”

Beck, Richard, Miami University, Oxford, OH    January 8
“Gateway to the Future: OhioView Pilot: Test-bed for a National Space-based
Geospatial Information Network”

Hart, Roger, ICASE    January 13
“Applications of Laser-induced Thermal Acoustics to Flow Characterization”

Munoz, Cesar, SRI International, Menlo Park, CA    January 19
“Types for Software Development”

Mascagni, Michael, University of Southern Mississippi    January 22
“A Deterministic Particle Method for One-dimensional Reaction-diffusion Equations”

Lin, Huimin, Chinese Academy of Sciences    January 25
“The ‘Symbolic’ Approach to Message-passing Processes”

Sobieski, Jaroslaw, NASA Langley Research Center    January 28
“Compute as Fast as Engineers Can Think”

Sidilkover, David, ICASE    January 29
“Towards a Unified Approach for Numerical Treatment of Incompressible and
Compressible Steady Flow”

Rubinstein, Robert, ICASE    February 10
“Characterization of Sound Radiation by Unresolved Scales of Motion in
Computational Aeroacoustics”

Nikolaidis, Efstratios, Virginia Polytechnic Institute and State University    February 12
“Design Under Uncertainty”

Botts, Mike, Global Hydrology and Climate Center, University of Alabama in Huntsville    February 25
“Development and Application of the Java-based Space-time Toolkit for Integration of
Disparate GeoSpatial Dynamic Data”

Ristorcelli, Raymond, Los Alamos National Laboratory    March 10
“Lagrangian Covariance Analysis of Homogeneous Beta-plane Turbulence”


ICASE STAFF


I.  ADMINISTRATIVE 

Manuel D. Salas, Director, M.S., Aeronautics and Astronautics, Polytechnic Institute of Brooklyn, 1970,
Fluid  Mechanics  and  Numerical  Analysis. 

Linda  T.  Johnson,  Office  and  Financial  Administrator 

Barbara  A.  Cardasis,  Administrative  Secretary 

Etta  M.  Morgan,  Accounting  Supervisor 

Emily  N.  Todd,  Conference  Manager/Executive  Assistant 

Shannon  K.  Verstynen,  Information  Technologist 

Gwendolyn  W.  Wesson,  Contract  Accounting  Clerk 

Shouben  Zhou,  Systems  Manager 

Peter  J.  Kearney,  Student  Assistant 

II.  SCIENCE  COUNCIL 

Lee Beach, Professor, Department of Physics, Computer Science & Engineering, Christopher Newport University.

Francine Berman, Professor, Department of Computer Science & Engineering, University of California-San
Diego. 

Joseph  E.  Flaherty,  Amos  Eaton  Professor,  Departments  of  Computer  Science  and  Mathematical  Sciences, 
Rensselaer  Polytechnic  Institute. 

Geoffrey  Fox,  Director,  Northeast  Parallel  Architectural  Center,  Syracuse  University. 

David  Gottlieb,  Professor,  Division  of  Applied  Mathematics,  Brown  University. 

Forrester  Johnson,  Aerodynamics  Research,  Boeing  Commercial  Airplane  Group. 

Robert  W.  MacCormack,  Professor,  Department  of  Aeronautics  and  Astronautics,  Stanford  University. 

Stanley  G.  Rubin,  Professor,  Department  of  Aerospace  Engineering  and  Engineering  Mechanics,  University 
of Cincinnati.

Manuel D. Salas, Director, Institute for Computer Applications in Science and Engineering, NASA Langley
Research  Center. 


III.  RESEARCH  FELLOWS 

Dimitri  Mavriplis  -  Ph.D.,  Mechanical  and  Aerospace  Engineering,  Princeton  University,  1988.  Applied  & 
Numerical  Mathematics  [Grid  Techniques  for  Computational  Fluid  Dynamics].  (February  1997  to  August 
2001) 

Piyush Mehrotra - Ph.D., Computer Science, University of Virginia, 1982. Computer Science [Programming
Languages  for  Multiprocessor  Systems].  (January  1991  to  September  1999) 


IV.  CHIEF  SCIENTIST 

Geoffrey Lilley - Ph.D., Engineering, Imperial College, London, England, 1945. Fluid Mechanics. (April
1998  to  December  1998) 


V.  SENIOR  STAFF  SCIENTISTS 

Thomas  W.  Crockett  -  B.S.,  Mathematics,  The  College  of  William  &  Mary,  1977.  Computer  Science  [System 
Software  for  Parallel  Computing,  Computer  Graphics,  and  Scientific  Visualization].  (February  1987  to 
August  2000) 

Sharath S. Girimaji - Ph.D., Mechanical and Aerospace Engineering, Cornell University, 1990. Fluid Mechanics
[Turbulence and Combustion]. (July 1993 to March 1999)

R.  Michael  Lewis  -  Ph.D.,  Mathematical  Sciences,  Rice  University,  1989.  Applied  &  Numerical  Mathematics 
[Multidisciplinary  Design  Optimization].  (May  1995  to  August  2000) 

Josip  Loncaric  -  Ph.D.,  Applied  Mathematics,  Harvard  University,  1985.  Applied  &  Numerical  Mathematics 
[Multidisciplinary  Design  Optimization].  (March  1996  to  August  1999) 

Kwan-Liu Ma - Ph.D., Computer Science, University of Utah, 1993. Computer Science [Visualization]. (May
1993  to  August  2001) 

David Sidilkover - Ph.D., Applied Mathematics, The Weizmann Institute of Science, 1989. Applied &
Numerical  Mathematics  [Numerical  Analysis  and  Algorithms].  (November  1994  to  August  1999) 


VI.  SCIENTIFIC  STAFF 

Brian  G.  Allan  -  Ph.D.,  Mechanical  Engineering,  University  of  California  at  Berkeley,  1996.  Applied  & 
Numerical Mathematics [Multidisciplinary Design Optimization]. (February 1996 to August 1999)

Eyal Arian - Ph.D., Applied Mathematics, The Weizmann Institute of Science, Israel, 1995. Applied &
Numerical Mathematics [Multidisciplinary Design Optimization]. (October 1994 to August 1999)

Po-Shu  Chen  -  Ph.D.,  Aerospace  Engineering,  University  of  Colorado-Boulder,  1997.  Physical  Sciences 
[Computational  Structures].  (January  1998  to  January  2001) 

Roger  C.  Hart  -  Ph.D.,  Physics,  University  of  Tennessee,  1991.  Physical  Sciences  [Measurement  Science  and 
Technology].  (December  1998  to  October  1999) 

Gerald  Lüttgen  -  Ph.D.,  Computer  Science,  University  of  Passau,  Germany,  1998.  Computer  Science  [Formal
Methods  Research  for  Safety  Critical  Systems].  (October  1998  to  August  2000) 

Li-Shi  Luo  -  Ph.D.,  Physics,  Georgia  Institute  of  Technology,  1993.  Computer  Science  [Parallel  Algorithms].
(November  1996  to  October  1999) 

Zoubeida  Ounaies  -  Ph.D.,  Engineering  Science  and  Mechanics,  The  Pennsylvania  State  University,  1995. 
Physical  Sciences  [Characterization  of  Advanced  Piezoelectric  Materials].  (March  1999  to  November  1999) 

Alexander  Povitsky  -  Ph.D.,  Mechanical  Engineering,  Moscow  Institute  of  Steel  and  Alloys  Technology 
(MISA),  Russia,  1988.  Computer  Science  [Parallelization  and  Formulation  of  Higher  Order  Schemes  for 
Aeroacoustics  Noise  Propagation].  (October  1997  to  August  2000) 


VII.  VISITING  SCIENTISTS 

Sang-Hyon  Chu  -  Ph.D.,  Chemical  Engineering,  Seoul  National  University,  1998.  Physical  Sciences  [Smart
Materials  and  Flow  Control].  (March  1998  to  November  1999)

Boris  Diskin  -  Ph.D.,  Applied  Mathematics,  The  Weizmann  Institute  of  Science,  Israel,  1998.  Teaching
Assistant,  The  Weizmann  Institute  of  Science,  Israel.  Applied  &  Numerical  Mathematics  [Convergence 
Acceleration].  (July  1998  to  June  1999) 

Jack  Levine  -  B.S.,  Engineering  Physics,  Hofstra  University,  New  York,  1956.  NASA  Retired.  Physical 
Sciences.  (January  1999  to  April  1999)

Alexander  Muravyov  -  Ph.D.,  Mechanical  Engineering,  University  of  British  Columbia,  Vancouver,  1997. 
Department  of  Mechanical  Engineering,  University  of  British  Columbia,  Vancouver.  Fluid  Mechanics
[Computational  Structural  Acoustics].  (June  1998  to  October  1998)

Cord-Christian  Rossow  -  Ph.D.,  Aerospace  Engineering,  Technical  University  of  Braunschweig,  Germany, 
1988.  Branch  Head,  Dr.-Ing,  DLR,  Institute  of  Design  Aerodynamics,  Germany.  Applied  &  Numerical
Mathematics.  (February  1999  to  August  1999) 

David  R.  Picasso  -  M.S.,  Management,  Stanford  University,  1993.  NASA  Retired.  Computer  Science.
(January  1999  to  April  1999) 

Linda  Stals  -  Ph.D.,  Mathematics,  Australian  National  University,  1996.  Post-Doc,  Department  of  Computer
Science,  Old  Dominion  University.  Computer  Science  [Parallel  Implicit  Multilevel  Algorithms].  (November 
1998  to  October  1999) 

John  Van  Rosendale  -  Ph.D.,  Computer  Science,  University  of  Illinois,  1980.  Program  Director  for  New
Technologies,  Division  of  Advanced  Scientific  Computing,  National  Science  Foundation.  Computer  Science 
[Parallel  Computing].  (July  1994  to  March  1999) 

Kun  Xu  -  Ph.D.,  Astrophysics,  Columbia  University,  1993.  Assistant  Professor,  Department  of  Mathematics, 
The  Hong  Kong  University  of  Science  &  Technology.  Applied  &  Numerical  Mathematics  [Developing
Gas-kinetic  Schemes].  (August  1998  to  January  1999)

Nail  K.  Yamaleev  -  Ph.D.,  Numerical  Methods  and  Mathematical  Modeling,  Moscow  Institute  of  Physics 
and  Technology,  1993.  Senior  Research  Scientist,  Department  of  Computational  Mathematics,  Institute  of 
Mathematics,  Ufa,  Russia.  Physical  Sciences  [Fluid  Mechanics].  (February  1999  to  September  1999)

Ye  Zhou  -  Ph.D.,  Physics,  The  College  of  William  &  Mary,  1987.  Department  of  Aerospace  Science
Engineering,  Tuskegee  University.  Fluid  Mechanics  [Lattice  Boltzmann  Method  to  Gas  Liquid  Flow].  (October
1998  to  May  1999) 

VIII.  ASSOCIATE  RESEARCH  FELLOW 

David  E.  Keyes  -  Ph.D.,  Applied  Mathematics,  Harvard  University,  1984.  Computer  Science  [Parallel
Numerical  Algorithms].


IX.  CONSULTANTS 


Saul  Abarbanel  -  Ph.D.,  Theoretical  Aerodynamics,  Massachusetts  Institute  of  Technology,  1959.  Professor, 
Department  of  Applied  Mathematics,  Tel  Aviv  University,  Israel.  Applied  &  Numerical  Mathematics  [Global 

Siva  Thangam  -  Ph.D.,  Mechanical  Engineering,  Rutgers  University,  1980.  Professor,  Department  of
Mechanical  Engineering,  Stevens  Institute  of  Technology.  Fluid  Mechanics  [Turbulence  Modeling  and  Simulation].

Virginia  Torczon  -  Ph.D.,  Mathematical  Sciences,  Rice  University,  1989.  Assistant  Professor,  Department  of 